Title of the Invention

NUCLEOTIDE AND AMINO ACID SEQUENCES OF THE ENVELOPE 1 GENE OF 51 HEPATITIS
C VIRUS ISOLATES AND THE USE OF REAGENTS DERIVED THEREFROM AS DIAGNOSTIC 
REAGENTS AND VACCINES 






Field Of Invention 
The present invention is in the field of 
hepatitis virology. The invention relates to the complete 
nucleotide and deduced amino acid sequences of the envelope 
1 (E1) gene of 51 hepatitis C virus (HCV) isolates from
around the world and the grouping of these isolates into 
twelve distinct HCV genotypes. More specifically, this 
invention relates to oligonucleotides, peptides and 
recombinant proteins derived from the envelope 1 gene 
sequences of the 51 isolates of hepatitis C virus and to 
diagnostic methods and vaccines which employ these 
reagents.

Background Of Invention 
Hepatitis C, originally called non-A, non-B 
hepatitis, was first described in 1975 as a disease 
serologically distinct from hepatitis A and hepatitis B 
(Feinstone, S.M. et al. (1975) N. Engl. J. Med. 292:767-770).
Although hepatitis C was (and is) the leading type
of transfusion-associated hepatitis as well as an important
part of community-acquired hepatitis, little progress was
made in understanding the disease until the recent
identification of hepatitis C virus (HCV) as the causative
agent of hepatitis C via the cloning and sequencing of the
HCV genome (Choo, Q.-L. et al. (1989) Science 244:359-362).
The sequence information generated by this study resulted
in the characterization of HCV as a small, enveloped,
positive-stranded RNA virus and led to the demonstration
that HCV is a major cause of both acute and chronic
hepatitis worldwide (Weiner, A.J. et al. (1990) Lancet
335:1-3). These observations, combined with studies
showing that over 50% of acute cases of hepatitis C
progress to chronicity with 20% of these resulting in
cirrhosis and an undetermined proportion progressing to
liver cancer, have led to tremendous efforts by
investigators within the hepatitis C field to develop
diagnostic assays and vaccines which can detect and prevent
hepatitis C infection.

The cloning and sequencing of the HCV genome by 
Choo et al. (1989) has permitted the development of
serologic tests which can detect HCV or antibody to HCV
(Kuo, G. et al. (1989) Science 244:362-364). In addition,
the work of Choo et al. has also allowed the development of
methods for detecting HCV infection via amplification of
HCV RNA sequences by reverse transcription and cDNA
polymerase chain reaction (RT-PCR) using primers derived
from the HCV genomic sequence (Weiner, A.J. et al.).
However, although the development of these diagnostic
methods has resulted in improved diagnosis of HCV
infection, only approximately 60% of cases of hepatitis C
are associated with a factor identified as contributing to
transmission of HCV (Alter, M.J. et al. (1989) JAMA
262:1201-1205). This observation suggests that effective
control of hepatitis C transmission is likely to occur only
via universal pediatric vaccination as has been initiated
recently for hepatitis B virus. Unfortunately, attempts to
date to protect chimpanzees from hepatitis C infection via
administration of recombinant vaccines have had only
limited success. Moreover, the apparent genetic
heterogeneity of HCV, as indicated by the recent assignment
of all available HCV isolates to one of four genotypes, I-IV
(Okamoto, H. et al. (1992) J. Gen. Virol. 73:673-679),
presents additional hurdles which must be overcome in order
to develop accurate and effective diagnostic assays and
vaccines.

For example, one possible obstacle to the
development of effective hepatitis C vaccines would arise
if the observed genetic heterogeneity of HCV reflects
serologic heterogeneity. In such a case, the most
genetically diverse strains of HCV may then represent
different serotypes of HCV with the result being that
infection with one strain may not protect against infection
with another. Indeed, the inability of one strain to
protect against infection with another strain was recently
noted by both Farci et al. (Farci, P. et al. (1992) Science
258:135-140) and Prince et al. (Prince, A.M. et al. (1992)
J. Infect. Dis. 165:438-443), each of whom presented
evidence that while infection with one strain of HCV does
modify the degree of the hepatitis C associated with the
reinfection, it does not protect against reinfection with a
closely related strain. The genetic heterogeneity among
different HCV strains also increases the difficulty
encountered in developing RT-PCR assays to detect HCV
infection since such heterogeneity often results in
false-negative results because of primer and template mismatch.
In addition, currently used serologic tests for detection
of HCV or for detection of antibody to HCV are not
sufficiently well developed to detect all of the HCV
genotypes which might exist in a given blood sample.
Finally, in terms of choosing the proper treatment modality
to combat hepatitis infection, the inability of presently
available serologic assays to distinguish among the various
genotypes of HCV represents a significant shortcoming in
that recent reports suggest that an HCV-infected patient's
response to therapy might be related to the genotype of the
infectious virus (Yoshioka, K. et al. (1992) Hepatology
16:293-299; Kanai, K. et al. (1992) Lancet 339:1543; Lau,
J.Y.N. et al. (1992) Hepatology 16:209A). Indeed, the data
presented in the above studies suggest that the closely
related genotypes I and II are less responsive to
interferon therapy than are the closely related genotypes
III and IV. Moreover, preliminary data by Pozzato et al.
(Pozzato, G. et al. (1991) Lancet 338:509) suggest that
different genotypes may be associated with different types
or degrees of clinical disease. Taken together, these
studies suggest that before effective vaccines against HCV
infection can be developed, and indeed, before more
accurate and effective methods for diagnosis and treatment
of HCV infection can be produced, one must obtain a greater
knowledge about the genetic and serologic diversity of HCV
isolates.

In a recent attempt to gain an understanding of
the extent of genetic heterogeneity among HCV strains, Bukh
et al. carried out a detailed analysis of HCV isolates via
the use of PCR technology to amplify different regions of
the HCV genome (Bukh, J. et al. (1992a) Proc. Natl. Acad.
Sci. U.S.A. 89:187-191). Following PCR amplification, the
5'-noncoding (5' NC) portion of the genomes of various HCV
isolates was sequenced and it was found that primer pairs
designed from conserved regions of the 5' NC region of the
HCV genome were more sensitive for detecting the presence
of HCV than were primer pairs representing other portions
of the genome (Bukh, J. et al. (1992b) Proc. Natl. Acad.
Sci. U.S.A. 89:4942-4946). In addition, the authors noted
that although many of the HCV isolates examined could be
classified into the four genotypes described by Okamoto et
al. (1992), other previously undescribed genotypes emerged
based on genetic heterogeneity observed in the 5' NC region
of the various isolates. One of the most prominent of
these newly noted genotypes comprised a group of related
viruses that contained the most genetically divergent 5' NC
regions of those studied. This group of viruses,
tentatively classified as a fifth genotype, is very
similar to strains recently described by others (Cha, T.-A.
et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:7144-7148;
Chan, S-W. et al. (1992) J. Gen. Virol. 73:1131-1141 and
Lee, C-H. et al. (1992) J. Clin. Microbiol. 30:1602-1604).
In addition, at least four more putative genotypes were
identified, thereby providing evidence that the genetic
heterogeneity of HCV was more extensive than previously
appreciated.

However, while the studies of Bukh et al. (1992a
and b) provided new and useful information on the genetic
heterogeneity of HCV, it is widely appreciated by those
skilled in the art that the three structural genes of HCV,
core (C), envelope 1 (E1) and envelope 2/nonstructural 1
(E2/NS1), are the most important for the development of
serologic diagnostics and vaccines since it is the product
of these genes that constitutes the hepatitis C virion.
Thus, a determination of the nucleotide sequence of one or
all of the structural genes of a variety of HCV isolates
would be useful in designing reagents for use in diagnostic
assays and vaccines since a demonstration of genetic
heterogeneity in a structural gene(s) of HCV isolates might
suggest that some of the HCV genotypes represent distinct
serotypes of HCV based upon the previously observed
relationship between genetic heterogeneity and serologic
heterogeneity among another group of single-stranded,
positive-sense RNA viruses, the picornaviruses (Rueckert,
R.R. "Picornaviridae and their replication", in Fields,
B.N. et al., eds. Virology, New York: Raven Press, Ltd.
(1990) 507-548).



Summary of Invention

The present invention relates to 51 cDNAs, each
encoding the complete nucleotide sequence of the envelope 1
(E1) gene of an isolate of human hepatitis C virus (HCV).

The present invention also relates to the nucleic 
acid and deduced amino acid sequences of these E1 cDNAs.

It is an object of this invention to provide
synthetic nucleic acid sequences capable of directing
production of recombinant E1 proteins, as well as
equivalent natural nucleic acid sequences. Such natural
nucleic acid sequences may be isolated from a cDNA or
genomic library from which the gene capable of directing
synthesis of the E1 proteins may be identified and
isolated. For purposes of this application, nucleic acid
sequence refers to RNA, DNA, cDNA or any synthetic variant
thereof which encodes peptides.

The invention also relates to the method of
preparing recombinant E1 proteins derived from the E1 cDNA
sequences by cloning the nucleic acid and inserting the
cDNA into an expression vector and expressing the
recombinant protein in a host cell.

The invention also relates to isolated and
substantially purified recombinant E1 proteins and analogs
thereof encoded by the E1 cDNAs.

The invention further relates to the use of
recombinant E1 proteins as diagnostic agents and as
vaccines.

The invention also relates to the use of
single-stranded antisense poly- or oligonucleotides derived from
the E1 cDNAs to inhibit the expression of the hepatitis C
E1 gene.
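
As an illustration only, an antisense oligonucleotide is the
reverse complement of a stretch of the sense (coding) strand. The
following minimal sketch assumes the sense sequence is supplied as a
plain string of A, C, G and T (or U); the function name antisense_dna
and the 12-nucleotide example are hypothetical and are not taken from
the sequence listing.

```python
# Complement table for DNA output; U in an RNA input is treated like T.
COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna(sense):
    """Return the reverse complement (antisense strand, 5'->3') of a sense sequence."""
    return sense.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical 12-nucleotide sense fragment, for illustration only.
    print(antisense_dna("ATGTACCAAGTG"))
```

The sketch returns a DNA oligonucleotide; choosing the target region
and oligonucleotide length is outside its scope.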

The invention further relates to multiple
computer-generated alignments of the nucleotide and deduced
amino acid sequences of the 51 E1 cDNAs. These multiple
sequence alignments serve to highlight regions of homology
and non-homology between different sequences and hence, can
be used by one skilled in the art to design peptides and
oligonucleotides useful as reagents in diagnostic assays
and vaccines.

The invention therefore also relates to purified
and isolated peptides and analogs thereof derived from E1
cDNA sequences.

The invention further relates to the use of these 
peptides as diagnostic agents and vaccines. 

The present invention also encompasses methods of
detecting antibodies specific for hepatitis C virus in
biological samples. The methods of detecting HCV or
antibodies to HCV disclosed in the present invention are
useful for diagnosis of infection and disease caused by HCV
and for monitoring the progression of such disease. Such
methods are also useful for monitoring the efficacy of
therapeutic agents during the course of treatment of HCV
infection and disease in a mammal.

The invention also provides a kit for the
detection of antibodies specific for HCV in a biological
sample where said kit contains at least one purified and
isolated peptide derived from the E1 cDNA sequences.

The invention further provides isolated and
purified genotype-specific oligonucleotides and analogs
thereof derived from E1 cDNA sequences.

The invention also relates to a method for
detecting the presence of hepatitis C virus in a mammal,
said method comprising analyzing the RNA of a mammal for
the presence of hepatitis C virus. The invention further
relates to a method for determining the genotype of
hepatitis C virus present in a mammal. This method is
useful in determining the proper course of treatment for an
HCV-infected patient.

The invention also provides a diagnostic kit for
the detection of hepatitis C virus in a biological sample.
The kit comprises purified and isolated nucleic acid
sequences useful as primers for reverse-transcription
polymerase chain reaction (RT-PCR) analysis of RNA for the
presence of hepatitis C virus.

The invention further provides a diagnostic kit
for the determination of the genotype of a hepatitis C
virus present in a mammal. The kit comprises purified and
isolated nucleic acid sequences useful as primers for RT-PCR
analysis of RNA for the presence of HCV in a biological
sample and purified and isolated nucleic acid sequences
useful as hybridization probes in determining the genotype
of the HCV isolate detected in PCR.

This invention also relates to pharmaceutical
compositions for use in prevention or treatment of
hepatitis C in a mammal.

Description of Figures 
Figures 1A-H show computer generated sequence
alignments of the nucleotide sequences of the 51 HCV E1
cDNAs. The single letter abbreviations used for the
nucleotides shown in Figures 1A-H are those standardly used
in the art. Figure 1A shows the alignment of SEQ ID NOs:1-8
to produce a consensus sequence for genotype I/1a.
Figure 1B shows the alignment of SEQ ID NOs:9-25 to produce
a consensus sequence for genotype II/1b. Figure 1C shows
the alignment of SEQ ID NOs:26-29 to produce a consensus
sequence for genotype III/2a. Figure 1D shows the
alignment of SEQ ID NOs:30-33 to produce a consensus
sequence for genotype IV/2b. Figure 1E shows the alignment
of SEQ ID NOs:35-39 to produce a consensus sequence for
genotype V/3a. Figure 1F shows the computer alignment of
SEQ ID NOs:42-43 to produce a consensus sequence for
genotype 4c. Figure 1G shows the alignment of SEQ ID
NOs:45-50 to produce a consensus sequence for genotype 5a.
The nucleotides shown in capital letters in the consensus
sequences of Figures 1A-G are those conserved within a
genotype while nucleotides shown in lower case letters in
the consensus sequences are those variable within a
genotype. In addition, in Figures 1A-E and 1G, when the
lower case letter is shown in a consensus sequence, the
lower case letter represents the nucleotide found most
frequently in the sequences aligned to produce the
consensus sequence. In Figure 1F, the lower case letters
shown in the consensus sequence are nucleotides in SEQ ID
NO:42 which differ from nucleotides found in the same
positions in SEQ ID NO:43. Finally, a hyphen at a
nucleotide position in the consensus sequences in Figures
1A-G indicates that two nucleotides were found in equal
numbers at that position in the aligned sequences. In the
aligned sequences, nucleotides are shown in lower case
letters if they differed from the nucleotides of both
adjacent isolates. Figure 1H shows the alignment of the
consensus sequences of Figures 1A-G with SEQ ID NO:34
(genotype 2c), SEQ ID NO:40 (genotype 4a), SEQ ID NO:41
(genotype 4b), SEQ ID NO:44 (genotype 4d) and SEQ ID NO:51
(genotype 6a) to produce a consensus sequence for all
twelve genotypes. This consensus sequence is shown as the
bottom line of Figure 1H where the nucleotides shown in
capital letters are conserved among all genotypes and a
blank space indicates that the nucleotide at that position
is not conserved among all genotypes.
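
The within-genotype consensus notation described for Figures 1A-E
and 1G can be sketched in a few lines of code. The following is only
an illustrative reading of that notation, not the program used to
generate the figures: it assumes the input sequences are already
aligned to equal length, and it treats any tie for the most frequent
residue as a hyphen. The function name consensus and the four short
example sequences are hypothetical.

```python
from collections import Counter


def consensus(aligned):
    """Build a consensus line from equal-length, pre-aligned sequences.

    Capital letter: position conserved in every sequence.
    Lower case letter: position variable; most frequent residue shown.
    Hyphen: the two most frequent residues occur in equal numbers.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    out = []
    for column in zip(*(s.upper() for s in aligned)):
        counts = Counter(column).most_common()
        if len(counts) == 1:
            out.append(counts[0][0])            # conserved position
        elif counts[0][1] == counts[1][1]:
            out.append("-")                     # tie for most frequent
        else:
            out.append(counts[0][0].lower())    # variable position
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    print(consensus(["ACGT", "ACGA", "ACCA", "ATCA"]))
```

The same function applies unchanged to aligned amino acid sequences,
since it only compares characters column by column.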

Figures 2A-H show computer alignments of the
deduced amino acid sequences of the 51 HCV E1 cDNAs. The
single letter abbreviations used for the amino acids shown
in Figures 2A-H follow the conventional amino acid
shorthand for the twenty naturally occurring amino acids.
Figure 2A shows the alignment of SEQ ID NOs:52-59 to
produce a consensus sequence for genotype I/1a. Figure 2B
shows the alignment of SEQ ID NOs:60-76 to produce a
consensus sequence for genotype II/1b. Figure 2C shows the
alignment of SEQ ID NOs:77-80 to produce a consensus
sequence for genotype III/2a. Figure 2D shows the
alignment of SEQ ID NOs:81-84 to produce a consensus
sequence for genotype IV/2b. Figure 2E shows the alignment
of SEQ ID NOs:86-90 to produce a consensus sequence for
genotype V/3a. Figure 2F shows the computer alignment of
SEQ ID NOs:93-94 to produce a consensus sequence for
genotype 4c. Figure 2G shows the alignment of SEQ ID
NOs:96-101 to produce a consensus sequence for genotype 5a.
The amino acids shown in capital letters in the consensus
sequences of Figures 2A-G are those conserved within a
genotype while amino acids shown in lower case letters in
the consensus sequences are those variable within a
genotype. In addition, in Figures 2A-E and 2G, when the
lower case letter is shown in a consensus sequence, the
letter represents the amino acid found most frequently in
the sequences aligned to produce the consensus sequence.
In Figure 2F, the lower case letters shown in the consensus
sequence are amino acids in SEQ ID NO:93 which differ from
amino acids found in the same positions in SEQ ID NO:94.
Finally, a hyphen at an amino acid position in the
consensus sequences of Figures 2A-G indicates that two
amino acids were found in equal numbers at that position in
the aligned sequences. In the aligned sequences, amino
acids are shown in lower case letters if they differed from
the amino acids of both adjacent isolates. Figure 2H shows
the alignment of the consensus sequences of Figures 2A-G
with SEQ ID NO:85 (genotype 2c), SEQ ID NO:91 (genotype
4a), SEQ ID NO:92 (genotype 4b), SEQ ID NO:95 (genotype 4d)
and SEQ ID NO:102 (genotype 6a) to produce a consensus
sequence for all twelve genotypes. This consensus sequence
is shown as the bottom line of Figure 2H where the amino
acids shown in capital letters are conserved among all
genotypes and a blank space indicates that the amino acid
at that position is not conserved among all genotypes.

Figure 3 shows multiple sequence alignment of the
deduced amino acid sequence of the E1 gene of 51 HCV
isolates collected worldwide. The consensus sequence of
the E1 protein is shown in boldface (top). In the
consensus sequence cysteine residues are highlighted with
stars, potential N-linked glycosylation sites are
underlined, and invariant amino acids are capitalized,
whereas variable amino acids are shown in lower case
letters. In the alignment, amino acids are shown in lower
case letters if they differed from the amino acid of both
adjacent isolates. Amino acid residues shown in bold print
in the alignment represent residues which at that position
in the amino acid sequence are genotype-specific. Amino
acids that were invariant among all HCV isolates are shown
as hyphens (-) in the alignment. Amino acid positions
correspond to those of the HCV prototype sequence (HCV-1,
Choo, Q.-L. et al. (1991) Proc. Natl. Acad. Sci. USA
88:2451-2455) with the first amino acid of the E1 protein at
position 192. The grouping of isolates into 12 genotypes
(I/1a, II/1b, III/2a, IV/2b, V/3a, 2c, 4a, 4b, 4c, 4d, 5a
and 6a) is indicated.

Figure 4 shows a dendrogram of the genetic
relatedness of the twelve genotypes of HCV based on the
percent amino acid identity of the E1 gene of the HCV
genome. The twelve genotypes shown are designated as I/1a,
II/1b, III/2a, IV/2b, V/3a, 2c, 4a, 4b, 4c, 4d, 5a and 6a.
The shaded bars represent a range showing the maximum and
minimum homology between the amino acid sequence of any one
isolate of the genotype indicated and the amino acid
sequence of any other isolate.

Figure 5 shows the distribution of the complete
E1 gene sequence of 74 HCV isolates into the twelve HCV
genotypes in the 12 countries studied. For 51 of these HCV
isolates, including 8 isolates of genotype I/1a, 17
isolates of genotype II/1b and 26 isolates comprising the
additional 10 genotypes, the complete E1 gene sequence was
determined. In the remaining 23 isolates, all of genotypes
I/1a and II/1b, the genotype assignment was based on only a
partial E1 gene sequence. The partially sequenced isolates
did not represent additional genotypes in any of the 12
countries. The number of isolates of a particular genotype
is given in each of the 12 countries studied. For ease of
viewing, those genotypes designated by two terms (e.g.,
I/1a) are indicated by the latter term (e.g., 1a). The
designations used for each country are: Denmark (DK);
Dominican Republic (DR); Germany (D); Hong Kong (HK); India
(IND); Sardinia, Italy (S); Peru (P); South Africa (SA);
Sweden (SW); Taiwan (T); United States (US); and Zaire (Z).
National borders depicted in this figure represent those
existing at the time of sampling.



Detailed Description Of Invention

The present invention relates to 51 cDNAs, each
encoding the complete nucleotide sequence of the envelope 1
(E1) gene of an isolate of human hepatitis C virus (HCV).
The cDNAs of the present invention were obtained as
follows. Viral RNA was extracted from serum collected from
humans infected with hepatitis C virus and the viral RNA
was then reverse transcribed and amplified by polymerase
chain reaction using primers deduced from the sequence of
the HCV strain H-77 (Ogata, N. et al. (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:3392-3396). The amplified cDNA was
then isolated by gel electrophoresis and sequenced.

The present invention further relates to the
nucleotide sequences of the cDNAs encoding the E1 gene of
the 51 HCV isolates. These nucleotide sequences are shown
in the sequence listing as SEQ ID NO:1 through SEQ ID
NO:51.

The abbreviations used for the nucleotides are 
those standardly used in the art. 

The deduced amino acid sequences of SEQ ID
NO:1 through SEQ ID NO:51 are presented in the sequence
listing as SEQ ID NO:52 through SEQ ID NO:102, where the
amino acid sequence in SEQ ID NO:52 is deduced from the
nucleotide sequence shown in SEQ ID NO:1, the amino acid
sequence shown in SEQ ID NO:53 is deduced from the
nucleotide sequence shown in SEQ ID NO:2 and so on. The
deduced amino acid sequence of each of SEQ ID NOs:52-102
starts at nucleotide 1 of the corresponding sequence shown
in SEQ ID NOs:1-51 and extends 595 nucleotides.
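
By way of illustration, deducing an amino acid sequence from a
cDNA sequence beginning at nucleotide 1 amounts to reading successive
codons with the standard genetic code. The sketch below assumes an
unambiguous A/C/G/T (or U) input and reading frame 1; the names
CODON_TABLE and translate, and the example fragment, are hypothetical
and do not reproduce any entry of the sequence listing.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides):
    """Translate from nucleotide 1 in frame 1; a trailing partial codon is ignored.

    Stop codons are written as '*'. Ambiguous bases raise KeyError.
    """
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    # Hypothetical 12-nucleotide fragment, for illustration only.
    print(translate("ATGTACCAAGTG"))
```

The output uses single letter amino acid codes; converting to the
three letter codes used in the sequence listing is a simple lookup.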

The three letter abbreviations used in SEQ ID
NOs:52-102 follow the conventional amino acid shorthand for
the twenty naturally occurring amino acids.

Preferably, the E1 proteins or peptides of the
present invention are substantially homologous to, and most
preferably biologically equivalent to, the native HCV E1
proteins or peptides. By "biologically equivalent" as used
throughout the specification and claims, it is meant that
the compositions are immunogenically equivalent to the
native E1 proteins or peptides. The E1 proteins or
peptides of the present invention may also stimulate the
production of protective antibodies upon injection into a
mammal that would serve to protect the mammal upon
challenge with HCV. By "substantially homologous" as used
throughout the ensuing specification and claims to describe
E1 proteins and peptides, it is meant a degree of homology
in the amino acid sequence to the native E1 proteins or
peptides. Preferably the degree of homology is in excess
of 90%, preferably in excess of 95%, with a particularly
preferred group of proteins being in excess of 99%
homologous with the native E1 proteins or peptides.
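
One simple way to put a number on the degree of homology is the
percent of identical residues between two aligned sequences. The
sketch below is an assumption about how such a figure could be
computed, not a definition taken from this specification: it requires
the two sequences to be pre-aligned to equal length, does not treat
gaps specially, and uses hypothetical function names and toy peptides.

```python
def percent_identity(a, b):
    """Percent of positions with identical residues in two aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def exceeds_homology(a, b, threshold=90.0):
    """True if identity is strictly in excess of the threshold (e.g. 90, 95 or 99)."""
    return percent_identity(a, b) > threshold


if __name__ == "__main__":
    # Hypothetical 20-residue peptides differing at one position.
    native = "YQVRNSTGLYHVTNDCPNSS"
    variant = "YEVRNSTGLYHVTNDCPNSS"
    print(percent_identity(native, variant), exceeds_homology(native, variant))
```

Because the thresholds are stated as "in excess of", the comparison
is strict: a pair that is exactly 90% identical does not pass the
90% test.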

Variations are contemplated in the cDNA sequences
shown in SEQ ID NO:1 through SEQ ID NO:51 which will result
in a DNA sequence that is capable of directing production
of analogs of the corresponding envelope 1 (E1) protein
shown in SEQ ID NO:52 through SEQ ID NO:102. It should be
noted that the DNA sequences set forth above represent a
preferred embodiment of the present invention. Due to the
degeneracy of the genetic code, it is to be understood that
numerous choices of nucleotides may be made that will lead
to a DNA sequence capable of directing production of the
instant E1 protein or its analogs. As such, DNA sequences
which are functionally equivalent to the sequence set forth
above or which are functionally equivalent to sequences
that would direct production of analogs of the E1 proteins
produced pursuant to the amino acid sequences set forth
above, are intended to be encompassed within the present
invention.
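
The effect of the degeneracy of the genetic code can be made
concrete by counting how many distinct coding sequences specify the
same peptide: the count is the product, over residues, of the number
of synonymous codons for each residue. The sketch below assumes the
standard genetic code and single letter amino acid codes; the name
count_encodings and the example peptides are hypothetical.

```python
from collections import Counter
from math import prod

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Number of synonymous codons per amino acid (for example L -> 6, M -> 1).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide):
    """Number of distinct nucleotide sequences encoding the given peptide."""
    residues = peptide.upper()
    unknown = set(residues) - set(CODONS_PER_RESIDUE) | ({"*"} & set(residues))
    if unknown:
        raise ValueError(f"not a standard amino acid: {sorted(unknown)}")
    return prod(CODONS_PER_RESIDUE[aa] for aa in residues)


if __name__ == "__main__":
    # Hypothetical four-residue peptide: Y(2) x Q(2) x V(4) x R(6) codons.
    print(count_encodings("YQVR"))
```

Even a four-residue peptide can be encoded in many ways, which is why
the number of functionally equivalent DNA sequences for a full-length
protein is very large.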

The term analog as used throughout the
specification or claims to describe the E1 proteins or
peptides of the present invention, includes any protein or
peptide having an amino acid residue sequence substantially
identical to a sequence specifically shown herein in which
one or more residues have been conservatively substituted
with a biologically equivalent residue. Examples of
conservative substitutions include the substitution of one
non-polar (hydrophobic) residue such as isoleucine, valine,
leucine or methionine for another, the substitution of one
polar (hydrophilic) residue for another such as between
arginine and lysine, between glutamine and asparagine,
between glycine and serine, the substitution of one basic
residue such as lysine, arginine or histidine for another,
or the substitution of one acidic residue, such as aspartic
acid or glutamic acid for another.
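
The example substitution classes listed above can be encoded as
sets of single letter amino acid codes and used to screen a candidate
sequence against a native one. The sketch below covers only the
examples named in the preceding paragraph, which are not exhaustive,
and it says nothing about biological equivalence, which must be
established experimentally. The names GROUPS, is_conservative and
only_conservative_changes, and the toy peptides, are hypothetical.

```python
# Example classes taken from the paragraph above (single letter codes).
GROUPS = [
    set("IVLM"),  # non-polar (hydrophobic): Ile, Val, Leu, Met
    set("RK"),    # polar pair: Arg, Lys
    set("QN"),    # polar pair: Gln, Asn
    set("GS"),    # polar pair: Gly, Ser
    set("KRH"),   # basic: Lys, Arg, His
    set("DE"),    # acidic: Asp, Glu
]


def is_conservative(a, b):
    """True if residues a and b differ and share one of the example classes."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in GROUPS)


def only_conservative_changes(native, candidate):
    """True if every differing position is a conservative substitution."""
    if len(native) != len(candidate):
        raise ValueError("sequences must be of equal length")
    return all(
        x.upper() == y.upper() or is_conservative(x, y)
        for x, y in zip(native, candidate)
    )


if __name__ == "__main__":
    # Hypothetical peptides: K->R and L->I are both within the listed classes.
    print(only_conservative_changes("MKDL", "MRDI"))
```

A candidate that passes this screen differs from the native sequence
only by substitutions of the kinds exemplified above; additions and
deletions are outside the scope of the sketch.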

The phrase "conservative substitution" also
includes the use of a chemically derivatized residue in
place of a non-derivatized residue provided that the
resulting protein or peptide is biologically equivalent to
the native E1 protein or peptide.

"Chemical derivative" refers to an E1 protein or
peptide having one or more residues chemically derivatized
by reaction of a functional side group. Examples of such
derivatized molecules include, but are not limited to,
those molecules in which free amino groups have been
derivatized to form amine hydrochlorides, p-toluene
sulfonyl groups, carbobenzoxy groups, t-butyloxycarbonyl
groups, chloracetyl groups or formyl groups. Free carboxyl
groups may be derivatized to form salts, methyl and ethyl
esters or other types of esters or hydrazides. Free
hydroxyl groups may be derivatized to form O-acyl or
O-alkyl derivatives. The imidazole nitrogen of histidine may
be derivatized to form N-im-benzylhistidine. Also included
as chemical derivatives are those proteins or peptides
which contain one or more naturally-occurring amino acid
derivatives of the twenty standard amino acids. For
example: 4-hydroxyproline may be substituted for proline;
5-hydroxylysine may be substituted for lysine;
3-methylhistidine may be substituted for histidine;
homoserine may be substituted for serine; and ornithine may
be substituted for lysine. The E1 protein or peptide of
the present invention also includes any protein or peptide
having one or more additions and/or deletions of residues
relative to the sequence of a peptide whose sequence is
shown herein, so long as the peptide is biologically
equivalent to the native E1 protein or peptide.

The present invention also includes a recombinant
DNA method for the manufacture of HCV E1 proteins. In this
method, natural or synthetic nucleic acid sequences may be
used to direct the production of E1 proteins.

In one embodiment of the invention, the method
comprises:

(a) preparation of a nucleic acid sequence
capable of directing a host organism to produce HCV E1
protein;

(b) cloning the nucleic acid sequence into a
vector capable of being transferred into and replicated in
a host organism, such vector containing operational
elements for the nucleic acid sequence;

(c) transferring the vector containing the
nucleic acid and operational elements into a host organism
capable of expressing the protein;

(d) culturing the host organism under conditions
appropriate for amplification of the vector and expression
of the protein; and

(e) harvesting the protein.

In another embodiment of the invention, the
method for the recombinant DNA synthesis of an HCV E1
protein encoded by any one of the nucleic acid sequences
shown in SEQ ID NOs:1-51 comprises:

(a) culturing a transformed or transfected host
organism containing a nucleic acid sequence capable of
directing the host organism to produce a protein, under
conditions such that the protein is produced, said protein
exhibiting substantial homology to a native E1 protein
isolated from HCV having the amino acid sequence according
to any one of the amino acid sequences shown in SEQ ID
NOs:52-102 or combinations thereof.

In one embodiment, the RNA sequence of an HCV
isolate was isolated and cloned to cDNA as follows. Viral
RNA is extracted from a biological sample collected from
human subjects infected with hepatitis C and the viral RNA
is then reverse transcribed and amplified by polymerase
chain reaction using primers deduced from the sequence of
HCV strain H-77 (Ogata et al. (1991)). Preferred primer
sequences are shown as SEQ ID NOs:103-108 in the sequence
listing. Once amplified, the PCR fragments are isolated by
gel electrophoresis and sequenced.

The vectors contemplated for use in the present
invention include any vectors into which a nucleic acid
sequence as described above can be inserted, along with any
preferred or required operational elements, and which
vector can then be subsequently transferred into a host
organism and replicated in such organisms. Preferred
vectors are those whose restriction sites have been well
documented and which contain the operational elements
preferred or required for transcription of the nucleic acid
sequence.

The "operational elements" as discussed herein
include at least one promoter, at least one operator, at
least one leader sequence, at least one terminator codon,
and any other DNA sequences necessary or preferred for
appropriate transcription and subsequent translation of the
vector nucleic acid. In particular, it is contemplated
that such vectors will contain at least one origin of
replication recognized by the host organism along with at
least one selectable marker and at least one promoter
sequence capable of initiating transcription of the nucleic
acid sequence.

In construction of the recombinant for expression
cloning vector of the present invention, it should
additionally be noted that multiple copies of the nucleic
acid sequence and its attendant operational elements may be
inserted into each vector. In such an embodiment, the host
organism would produce greater amounts per vector of the
desired E1 protein. The number of multiple copies of the
DNA sequence which may be inserted into the vector is
limited only by the ability of the resultant vector, due to
its size, to be transferred into and replicated and
transcribed in an appropriate host microorganism.

In another embodiment, restriction digest
fragments containing a coding sequence for E1 proteins can
be inserted into a suitable expression vector that
functions in prokaryotic or eukaryotic cells. By suitable
is meant that the vector is capable of carrying and
expressing a complete nucleic acid sequence coding for E1
protein. Preferred expression vectors are those that
function in a eukaryotic cell. Examples of such vectors
include but are not limited to vaccinia virus vectors,
adenovirus or herpes viruses. A preferred vector is the
baculovirus transfer vector, pBlueBac.

In yet another embodiment, the selected
recombinant expression vector may then be transfected into
a suitable eukaryotic cell system for purposes of
expressing the recombinant protein. Such eukaryotic cell
systems include but are not limited to cell lines such as
HeLa, MRC-5 or CV-1. A preferred eukaryotic cell system is
SF9 insect cells.

The expressed recombinant protein may be detected 
by methods known in the art including, but not limited to, 
Coomassie blue staining and Western blotting. 

The present invention also relates to
substantially purified and isolated recombinant E1
proteins. In one embodiment, the recombinant protein
expressed by the SF9 cells can be obtained as a crude
lysate or it can be purified by standard protein
purification procedures known in the art which may include
differential precipitation, molecular sieve chromatography,
ion-exchange chromatography, isoelectric focusing, gel
electrophoresis and affinity and immunoaffinity
chromatography. The recombinant protein may be purified by
passage through a column containing a resin which has bound
thereto antibodies specific for the open reading frame
(ORF) protein.

The present invention further relates to the use
of recombinant E1 proteins as diagnostic agents and
vaccines. In one embodiment, the expressed recombinant
proteins of this invention can be used in immunoassays for
diagnosing or prognosing hepatitis C in a mammal. For the
purposes of the present invention, "mammal", as used
throughout the specification and claims, includes, but is
not limited to, humans, chimpanzees, other primates and the
like. In a preferred embodiment, the immunoassay is useful
in diagnosing hepatitis C infection in humans.

Immunoassays of the present invention may be a
radioimmunoassay, Western blot assay, immunofluorescent
assay, enzyme immunoassay, chemiluminescent assay,
immunohistochemical assay and the like. Standard
techniques known in the art for ELISA are described in
Methods in Immunodiagnosis, 2nd Edition, Rose and Bigazzi,
eds., John Wiley and Sons, 1980 and Campbell et al.,
Methods of Immunology, W.A. Benjamin, Inc., 1964, both of
which are incorporated herein by reference. Such assays
may be a direct, indirect, competitive, or noncompetitive
immunoassay as described in the art (Oellerich, M. 1984. J.
Clin. Chem. Clin. Biochem. 22:895-904). Biological samples
appropriate for such detection assays include, but are not
limited to, serum, liver, saliva, lymphocytes or other
mononuclear cells.

In a preferred embodiment, test serum is reacted
with a solid phase reagent having surface-bound recombinant
HCV E1 protein as an antigen. The solid surface reagent
can be prepared by known techniques for attaching protein
to solid support material. These attachment methods
include non-specific adsorption of the protein to the
support or covalent attachment of the protein to a reactive
group on the support. After reaction of the antigen with
anti-HCV antibody, unbound serum components are removed by
washing and the antigen-antibody complex is reacted with a
secondary antibody such as labelled anti-human antibody.
The label may be an enzyme which is detected by incubating
the solid support in the presence of a suitable
fluorimetric or colorimetric reagent. Other detectable
labels may also be used, such as radiolabels or colloidal
gold, and the like.

The HCV E1 protein and analogs thereof may be
prepared in the form of a kit, alone, or in combinations
with other reagents such as secondary antibodies, for use
in immunoassays.

In yet another embodiment the recombinant E1
proteins or analogs thereof can be used as a vaccine to
protect mammals against challenge with hepatitis C. The
vaccine, which acts as an immunogen, may be a cell, cell
lysate from cells transfected with a recombinant expression
vector or a culture supernatant containing the expressed
protein. Alternatively, the immunogen is a partially or
substantially purified recombinant protein.

While it is possible for the immunogen to be
administered in a pure or substantially pure form, it is
preferable to present it as a pharmaceutical composition,
formulation or preparation.

The formulations of the present invention, both
for veterinary and for human use, comprise an immunogen as
described above, together with one or more pharmaceutically
acceptable carriers and optionally other therapeutic
ingredients. The carrier(s) must be "acceptable" in the
sense of being compatible with the other ingredients of the
formulation and not deleterious to the recipient thereof.
The formulations may conveniently be presented in unit
dosage form and may be prepared by any method well-known in
the pharmaceutical art.

All methods include the step of bringing into
association the active ingredient with the carrier which
constitutes one or more accessory ingredients. In general,
the formulations are prepared by uniformly and intimately
bringing into association the active ingredient with liquid
carriers or finely divided solid carriers or both, and
then, if necessary, shaping the product into the desired
formulation.

Formulations suitable for intravenous,
intramuscular, subcutaneous, or intraperitoneal
administration conveniently comprise sterile aqueous
solutions of the active ingredient with solutions which are
preferably isotonic with the blood of the recipient. Such
formulations may be conveniently prepared by dissolving the
solid active ingredient in water containing physiologically
compatible substances such as sodium chloride (e.g. 0.1-2.0 M),
glycine, and the like, and having a buffered pH
compatible with physiological conditions to produce an
aqueous solution, and rendering said solution sterile.
These may be present in unit or multi-dose containers, for
example, sealed ampoules or vials.

The formulations of the present invention may
incorporate a stabilizer. Illustrative stabilizers are
preferably incorporated in an amount of 0.11-10,000 parts
by weight per part by weight of immunogens. If two or more
stabilizers are to be used, their total amount is
preferably within the range specified above. These
stabilizers are used in aqueous solutions at the
appropriate concentration and pH. The specific osmotic
pressure of such aqueous solutions is generally in the
range of 0.1-3.0 osmoles, preferably in the range of 0.8-1.2.
The pH of the aqueous solution is adjusted to be
within the range of 5.0-9.0, preferably within the range of
6-8. In formulating the immunogen of the present
invention, an anti-adsorption agent may be used.

Additional pharmaceutical methods may be employed
to control the duration of action. Controlled release
preparations may be achieved through the use of polymer to
complex or adsorb the proteins or their derivatives. The
controlled delivery may be exercised by selecting
appropriate macromolecules (for example polyester,
polyamino acids, polyvinylpyrrolidone,
ethylenevinylacetate, methylcellulose,
carboxymethylcellulose, or protamine sulfate) and the
concentration of macromolecules as well as the methods of
incorporation in order to control release. Another
possible method to control the duration of action by
controlled-release preparations is to incorporate the
proteins, protein analogs or their functional derivatives
into particles of a polymeric material such as polyesters,
polyamino acids, hydrogels, poly(lactic acid) or ethylene
vinylacetate copolymers. Alternatively, instead of
incorporating these agents into polymeric particles, it is
possible to entrap these materials in microcapsules
prepared, for example, by coacervation techniques or by
interfacial polymerization, for example,
hydroxymethylcellulose or gelatin microcapsules and
poly(methylmethacrylate) microcapsules, respectively, or in
colloidal drug delivery systems, for example, liposomes,
albumin microspheres, microemulsions, nanoparticles, and
nanocapsules or in macroemulsions.

When oral preparations are desired, the
compositions may be combined with typical carriers, such as
lactose, sucrose, starch, talc, magnesium stearate,
crystalline cellulose, methyl cellulose, carboxymethyl
cellulose, glycerin, sodium alginate or gum arabic among
others.

The proteins of the present invention may be
supplied in the form of a kit, alone, or in the form of a
pharmaceutical composition as described above.

WO 95/01442                                  PCT/US94/07320

Vaccination can be conducted by conventional
methods. For example, the immunogen or immunogens (i.e.
the E1 protein may be administered alone or in combination
with the E1 proteins derived from other isolates of HCV)
can be used in a suitable diluent such as saline or water,
or complete or incomplete adjuvants. Further, the
immunogen(s) may or may not be bound to a carrier to make
the protein(s) immunogenic. Examples of such carrier
molecules include but are not limited to bovine serum
albumin (BSA), keyhole limpet hemocyanin (KLH), tetanus
toxoid, and the like. The immunogen(s) can be administered
by any route appropriate for antibody production such as
intravenous, intraperitoneal, intramuscular, subcutaneous,
and the like. The immunogen(s) may be administered once or
at periodic intervals until a significant titer of anti-HCV
antibody is produced. The antibody may be detected in the
serum using an immunoassay.

The administration of the immunogen(s) of the
present invention may be for either a prophylactic or
therapeutic purpose. When provided prophylactically, the
immunogen(s) is provided in advance of any exposure to HCV
or in advance of any symptoms due to HCV
infection. The prophylactic administration of the
immunogen serves to prevent or attenuate any subsequent
infection of HCV in a mammal. When provided
therapeutically, the immunogen(s) is provided at (or
shortly after) the onset of the infection or at the onset
of any symptom of infection or disease caused by HCV. The
therapeutic administration of the immunogen(s) serves to
attenuate the infection or disease.

In addition to use as a vaccine, the compositions
can be used to prepare antibodies to HCV E1 proteins. The
antibodies can be used directly as antiviral agents. To
prepare antibodies, a host animal is immunized using the E1
proteins native to the virus particle bound to a carrier as
described above for vaccines. The host serum or plasma is
collected following an appropriate time interval to provide
a composition comprising antibodies reactive with the E1
protein of the virus particle. The gamma globulin fraction
or the IgG antibodies can be obtained, for example, by use
of saturated ammonium sulfate or DEAE Sephadex, or other
techniques known to those skilled in the art. The
antibodies are substantially free of many of the adverse
side effects which may be associated with other antiviral
agents such as drugs.

The antibody compositions can be made even more
compatible with the host system by minimizing potential
adverse immune system responses. This is accomplished by
removing all or a portion of the Fc portion of a foreign
species antibody or using an antibody of the same species
as the host animal, for example, the use of antibodies from
human/human hybridomas. Humanized antibodies (i.e.,
nonimmunogenic in a human) may be produced, for example, by
replacing an immunogenic portion of an antibody with a
corresponding, but nonimmunogenic portion (i.e., chimeric
antibodies). Such chimeric antibodies may contain the
reactive or antigen binding portion of an antibody from one
species and the Fc portion of an antibody (nonimmunogenic)
from a different species. Examples of chimeric antibodies
include, but are not limited to, non-human mammal-human
chimeras, rodent-human chimeras, murine-human and rat-human
chimeras (Robinson et al., International Patent Application
184,187; Taniguchi M., European Patent Application 171,496;
Morrison et al., European Patent Application 173,494;
Neuberger et al., PCT Application WO 86/01533; Cabilly et
al., 1987 Proc. Natl. Acad. Sci. USA 84:3439; Nishimura et
al., 1987 Canc. Res. 47:999; Wood et al., 1985 Nature
314:446; Shaw et al., 1988 J. Natl. Cancer Inst. 80:1553,
all incorporated herein by reference).

General reviews of "humanized" chimeric
antibodies are provided by Morrison S., 1985 Science
229:1202 and by Oi et al., 1986 BioTechniques 4:214.

Suitable "humanized" antibodies can be
alternatively produced by CDR or CEA substitution (Jones et
al., 1986 Nature 321:552; Verhoeyen et al., 1988 Science
239:1534; Beidler et al., 1988 J. Immunol. 141:4053, all
incorporated herein by reference).

The antibodies or antigen binding fragments may
also be produced by genetic engineering. The technology
for expression of both heavy and light chain genes in E.
coli is the subject of the PCT patent applications;
publication number WO 901443, WO 901443, and WO 9014424 and
in Huse et al., 1989 Science 246:1275-1281.

The antibodies can also be used as a means of
enhancing the immune response. The antibodies can be
administered in amounts similar to those used for other
therapeutic administrations of antibody. For example,
normal immune globulin is administered at 0.02-0.1 ml/lb
body weight during the early incubation period of other
viral diseases such as rabies, measles, and hepatitis B to
interfere with viral entry into cells. Thus, antibodies
reactive with the HCV E1 protein can be passively
administered alone or in conjunction with another
antiviral agent to a host infected with an HCV to enhance the
immune response and/or the effectiveness of an antiviral
drug.

Alternatively, anti-HCV E1 antibodies can be
induced by administering anti-idiotype antibodies as
immunogens. Conveniently, a purified anti-HCV E1 antibody
preparation prepared as described above is used to induce
anti-idiotype antibody in a host animal. The composition is
administered to the host animal in a suitable diluent.
Following administration, usually repeated administration,
the host produces anti-idiotype antibody. To eliminate an
immunogenic response to the Fc region, antibodies produced
by the same species as the host animal can be used or the
Fc region of the administered antibodies can be removed.
Following induction of anti-idiotype antibody in the host
animal, serum or plasma is removed to provide an antibody
composition. The composition can be purified as described
above for anti-HCV E1 antibodies, or by affinity
chromatography using anti-HCV E1 antibodies bound to the
affinity matrix. The anti-idiotype antibodies produced are
similar in conformation to the authentic HCV E1 protein and
may be used to prepare an HCV vaccine rather than using an
HCV E1 protein.

When used as a means of inducing anti-HCV virus
antibodies in an animal, the manner of injecting the
antibody is the same as for vaccination purposes, namely
intramuscularly, intraperitoneally, subcutaneously or the
like in an effective concentration in a physiologically
suitable diluent with or without adjuvant. One or more
booster injections may be desirable.

The HCV E1 proteins of the invention are also
intended for use in producing antiserum designed for pre-
or post-exposure prophylaxis. Here an E1 protein, or
mixture of E1 proteins, is formulated with a suitable
adjuvant and administered by injection to human volunteers,
according to known methods for producing human antisera.
Antibody response to the injected proteins is monitored,
during a several-week period following immunization, by
periodic serum sampling to detect the presence of anti-HCV
E1 serum antibodies, using an immunoassay as described
herein.

The antiserum from immunized individuals may be
administered as a pre-exposure prophylactic measure for
individuals who are at risk of contracting infection. The
antiserum is also useful in treating an individual
post-exposure, analogous to the use of high titer antiserum
against hepatitis B virus for post-exposure prophylaxis.

For both in vivo use of antibodies to HCV
virus-like particles and proteins and anti-idiotype antibodies
and diagnostic use, it may be preferable to use monoclonal
antibodies. Monoclonal anti-HCV E1 protein antibodies or
anti-idiotype antibodies can be produced as follows. The
spleen or lymphocytes from an immunized animal are removed
and immortalized or used to prepare hybridomas by methods
known to those skilled in the art (Goding, J.W. 1983.
Monoclonal Antibodies: Principles and Practice, Academic
Press, Inc., NY, NY, pp. 56-97). To produce a human-human
hybridoma, a human lymphocyte donor is selected. A donor
known to be infected with HCV (where infection has been
shown for example by the presence of anti-virus antibodies
in the blood or by virus culture) may serve as a suitable
lymphocyte donor. Lymphocytes can be isolated from a
peripheral blood sample or spleen cells may be used if the
donor is subject to splenectomy. Epstein-Barr virus (EBV)
can be used to immortalize human lymphocytes or a human
fusion partner can be used to produce human-human
hybridomas. Primary in vitro immunization with peptides
can also be used in the generation of human monoclonal
antibodies.

Antibodies secreted by the immortalized cells are
screened to determine the clones that secrete antibodies of
the desired specificity. For monoclonal anti-E1
antibodies, the antibodies must bind to HCV E1 protein.
For monoclonal anti-idiotype antibodies, the antibodies
must bind to anti-E1 protein antibodies. Cells producing
antibodies of the desired specificity are selected.

The present invention also relates to the use of
single-stranded antisense poly- or oligonucleotides derived
from nucleotide sequences substantially homologous to those
shown in SEQ ID NOs:1-51 to inhibit the expression of
hepatitis C E1 genes. By substantially homologous, as used
throughout the specification and claims to describe the
nucleic acid sequences of the present invention, is meant a
level of homology between the nucleic acid sequence and the
SEQ ID NOs. referred to in that sentence. Preferably, the
level of homology is in excess of 80%, more preferably in
excess of 90%, with a preferred nucleic acid sequence being
in excess of 95% homologous with the DNA sequence shown in
the indicated SEQ ID NO. These antisense poly- or
oligonucleotides can be either DNA or RNA. The targeted
sequence is typically messenger RNA and more preferably, a
single sequence required for processing or translation of
the RNA. The antisense poly- or oligonucleotides can be
conjugated to a polycation such as polylysine as disclosed
in Lemaitre, M. et al. ((1989) Proc. Natl. Acad. Sci. USA
84:648-652) and this conjugate can be administered to a
mammal in an amount sufficient to hybridize to and inhibit
the function of the messenger RNA.
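
As a minimal sketch of what "antisense" means at the sequence level, the
following Python function returns the DNA oligonucleotide complementary to a
messenger-RNA segment. The function name and the example target are
hypothetical placeholders, not sequences from the sequence listing, and
chemical details such as polylysine conjugation are outside its scope.

```python
# Illustrative sketch only: derive an antisense DNA oligonucleotide from a
# target messenger-RNA segment. The target used below is a made-up
# placeholder, not an HCV E1 sequence.

RNA_TO_ANTISENSE_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(target_rna: str) -> str:
    """Return the DNA oligo (5'->3') complementary to target_rna (5'->3')."""
    target = target_rna.upper()
    unknown = set(target) - set(RNA_TO_ANTISENSE_DNA)
    if unknown:
        raise ValueError(f"unexpected RNA symbols: {sorted(unknown)}")
    # Complement each base while walking the target backwards, so that the
    # returned oligonucleotide also reads 5'->3'.
    return "".join(RNA_TO_ANTISENSE_DNA[base] for base in reversed(target))


hypothetical_target = "AUGGCUACCGUU"
print(antisense_dna(hypothetical_target))
```

An antisense RNA oligonucleotide would follow the same logic with U in place
of T in the output.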

The present invention further relates to multiple
computer-generated alignments of the nucleotide and deduced
amino acid sequences shown in SEQ ID NOs:1-102. Computer
analysis of the nucleotide sequences shown in SEQ ID NOs:1-51
and of the deduced amino acid sequences shown in SEQ ID
NOs:52-102 can be carried out using commercially available
computer programs known to one skilled in the art.

In one embodiment, computer analysis of SEQ ID
NOs:1-51 by the program GENALIGN (Intelligenetics, Inc.,
Mountain View, CA) results in distribution of the 51
sequences into twelve genotypes based upon the degree of
variation of the sequences. For the purposes of the
present invention, the nucleotide sequence identity of E1
cDNAs of HCV isolates of the same genotype is in the range
of about 85% to about 100% whereas the identity of E1 cDNA
sequences of different genotypes is in the range of about
50% to about 80%.
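
The identity ranges above can be illustrated with a short Python sketch. It
assumes two sequences that have already been aligned to equal length (with
"-" marking gaps) and simply counts matching positions; this is not the
GENALIGN algorithm, and the function names and the handling of values between
80% and 85% are assumptions made for illustration.

```python
# Illustrative sketch only: percent identity between two pre-aligned
# sequences, compared against the ranges stated in the text.


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of non-gap aligned positions at which the sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def relationship(identity: float) -> str:
    """Map an identity value onto the ranges given in the text."""
    if 85.0 <= identity <= 100.0:
        return "same genotype"
    if 50.0 <= identity <= 80.0:
        return "different genotypes"
    return "outside the stated ranges"


# Toy sequences (placeholders, not HCV data): 9 of 10 positions match.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
print(identity, relationship(identity))
```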

The grouping of SEQ ID NOs:1-51 into twelve HCV
genotypes is shown below.

SEQ ID NOs:     Genotype
1-8             I/1a
9-25            II/1b
26-29           III/2a
30-33           IV/2b
34              2c
35-39           V/3a
40              4a
41              4b
42-43           4c
44              4d
45-50           5a
51              6a
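
The grouping can also be expressed as a small lookup, sketched here in
Python. The ranges are copied from the table (with the genotype names written
as I/1a, II/1b and so on); the names GENOTYPE_RANGES and genotype_of are
illustrative and not part of the specification.

```python
# Illustrative sketch only: look up the genotype assigned to a nucleotide
# sequence by its SEQ ID NO, using the ranges from the table.

GENOTYPE_RANGES = [
    (1, 8, "I/1a"),
    (9, 25, "II/1b"),
    (26, 29, "III/2a"),
    (30, 33, "IV/2b"),
    (34, 34, "2c"),
    (35, 39, "V/3a"),
    (40, 40, "4a"),
    (41, 41, "4b"),
    (42, 43, "4c"),
    (44, 44, "4d"),
    (45, 50, "5a"),
    (51, 51, "6a"),
]


def genotype_of(seq_id_no: int) -> str:
    """Return the genotype for a nucleotide SEQ ID NO between 1 and 51."""
    for first, last, genotype in GENOTYPE_RANGES:
        if first <= seq_id_no <= last:
            return genotype
    raise ValueError(f"SEQ ID NO:{seq_id_no} is not one of SEQ ID NOs:1-51")


print(genotype_of(34))
print(genotype_of(17))
```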



For those genotypes containing more than one E1
nucleotide sequence, computer alignment of the constituent
nucleotide sequences of the genotype was conducted using
GENALIGN in order to produce a consensus sequence for each
genotype. These alignments and their resultant consensus
sequences are shown in Figures 1A-G for the seven genotypes
(I/1a, II/1b, III/2a, IV/2b, V/3a, 4c and 5a) which
comprise more than one nucleotide sequence. Further
alignment of the consensus sequences of Figures 1A-G with
SEQ ID NO:34 (genotype 2c), SEQ ID NO:40 (genotype 4a), SEQ
ID NO:41 (genotype 4b), SEQ ID NO:44 (genotype 4d) and SEQ
ID NO:51 (genotype 6a) produces a consensus sequence for
all twelve genotypes as shown in Figure 1H. The multiple
alignments of nucleotide sequences shown in Figures 1A-H
serve to highlight regions of homology and non-homology
between different sequences and hence, can be used by one
skilled in the art to design oligonucleotides useful as
reagents in diagnostic assays for HCV.

Examples of purified and isolated oligonucleotide
sequences provided by the present invention are shown as
SEQ ID NOs:109-135. The oligonucleotides shown in SEQ ID
NOs:109-135 are useful as "genotype-specific" primers and
probes since these oligonucleotides can hybridize
specifically to the nucleotide sequence of the E1 gene of
HCV isolates belonging to a single genotype. The
genotype-specificity of the oligonucleotides shown in SEQ ID
NOs:109-135 is as follows: SEQ ID NOs:109-110 are specific
for genotype I/1a; SEQ ID NOs:111-112 are specific for
genotype II/1b; SEQ ID NOs:113-114 are specific for
genotype III/2a; SEQ ID NOs:115-116 are specific for
genotype IV/2b; SEQ ID NOs:117-119 are specific for
genotype 2c; SEQ ID NOs:120-122 are specific for genotype
V/3a; SEQ ID NOs:123-124 are specific for genotype 4a; SEQ
ID NOs:125-126 are specific for genotype 4b; SEQ ID
NOs:127-128 are specific for genotype 4c; SEQ ID NOs:129-130
are specific for genotype 4d; SEQ ID NOs:131-132 are
specific for genotype 5a and SEQ ID NOs:133-135 are
specific for genotype 6a.

The oligonucleotides of this invention can be
synthesized using any of the known methods of
oligonucleotide synthesis (e.g., the phosphodiester method
of Agarwal et al. 1972, Angew. Chem. Int. Ed. Engl. 11:451,
the phosphotriester method of Hsiung et al. 1979, Nucleic
Acids Res. 6:1371, or the automated diethylphosphoramidite
method of Beaucage et al. 1981, Tetrahedron Letters
22:1859-1862), or they can be isolated fragments of
naturally occurring or cloned DNA. In addition, those
skilled in the art would be aware that oligonucleotides can
be synthesized by automated instruments sold by a variety
of manufacturers or can be commercially custom ordered and
prepared. In a preferred embodiment, SEQ ID NO:103 through
SEQ ID NO:135 are synthetic oligonucleotides.

The present invention also relates to a method 
for detecting the presence of HCV in a mammal, said method 
comprising analyzing the RNA of a mammal for the presence 
of hepatitis C virus. 

The RNA to be analyzed can be isolated from
serum, liver, saliva, lymphocytes or other mononuclear
cells as viral RNA, whole cell RNA or as poly(A)+ RNA.
Whole cell RNA can be isolated by methods known to those
skilled in the art. Such methods include extraction of RNA
by differential precipitation (Birnboim, H.C. (1988)
Nucleic Acids Res., 16:1487-1497), extraction of RNA by
organic solvents (Chomczynski, P. et al. (1987) Anal.
Biochem., 162:156-159) and extraction of RNA with strong
denaturants (Chirgwin, J.M. et al. (1979) Biochemistry,
18:5294-5299). Poly(A)+ RNA can be selected from whole cell
RNA by affinity chromatography on oligo-d(T) columns (Aviv,
H. et al. (1972) Proc. Natl. Acad. Sci., 69:1408-1412). A
preferred method of isolating RNA is extraction of viral
RNA by the guanidinium-phenol-chloroform method of Bukh et
al. (1992a).

The methods for analyzing the RNA for the
presence of HCV include Northern blotting (Alwine, J.C. et
al. (1977) Proc. Natl. Acad. Sci., 74:5350-5354), dot and
slot hybridization (Kafatos, F.C. et al. (1979) Nucleic
Acids Res., 7:1541-1552), filter hybridization (Hollander,
M.C. et al. (1990) Biotechniques, 9:174-179), RNase
protection (Sambrook, J. et al. (1989) in "Molecular
Cloning, A Laboratory Manual", Cold Spring Harbor Press,
Plainview, NY) and reverse-transcription polymerase chain
reaction (RT-PCR) (Watson, J.D. et al. (1992) in
"Recombinant DNA" Second Edition, W.H. Freeman and Company,
New York). A preferred method is RT-PCR. In this method,
the RNA can be reverse transcribed to first strand cDNA
using a primer or primers derived from the nucleotide
sequences shown in SEQ ID NOs:1-51. A preferred primer for
reverse transcription is that shown in SEQ ID NO:104. Once
the cDNAs are synthesized, PCR amplification is carried out
using pairs of primers designed to hybridize with sequences
in the HCV E1 cDNA which are an appropriate distance apart
(at least about 50 nucleotides) to permit amplification of
the cDNA and subsequent detection of the amplification
product. Each primer of a pair is a single-stranded
oligonucleotide of about 20 to about 60 bases in length
where one primer (the "upstream" primer) is complementary
to the original RNA and the second primer (the "downstream"
primer) is complementary to the first strand of cDNA
generated by reverse transcription of the RNA. The target
sequence is generally about 100 to about 300 base pairs
long but can be as large as 500-1500 base pairs.
Optimization of the amplification reaction to obtain
sufficiently specific hybridization to the E1 nucleotide
sequence is well within the skill in the art and is
preferably achieved by adjusting the annealing temperature.
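
The numerical guidelines above (primers of about 20 to about 60 bases,
binding sites at least about 50 nucleotides apart, and a target of roughly
100 to 1500 base pairs) can be checked with a short Python sketch. The
function name, the 1-based coordinate convention and the example coordinates
are assumptions made for illustration; the limits are treated as hard cutoffs
even though the text gives them only as approximate values.

```python
# Illustrative sketch only: check a candidate primer pair against the
# approximate guidelines in the text. Positions are 1-based and refer to
# the cDNA; the two primer sites are given by their outer coordinates.
from typing import List


def check_primer_pair(first_start: int, first_len: int,
                      second_end: int, second_len: int) -> List[str]:
    """Return a list of warnings; an empty list means no guideline is violated."""
    warnings = []
    for label, length in (("first", first_len), ("second", second_len)):
        if not 20 <= length <= 60:
            warnings.append(f"{label} primer length {length} is outside 20-60 bases")
    first_end = first_start + first_len - 1        # last base of the first site
    second_start = second_end - second_len + 1     # first base of the second site
    gap = second_start - first_end - 1             # bases between the two sites
    if gap < 50:
        warnings.append(f"primer sites are only {gap} nucleotides apart")
    target = second_end - first_start + 1          # length of the amplified target
    if not 100 <= target <= 1500:
        warnings.append(f"target length {target} bp is outside 100-1500 bp")
    return warnings


# Hypothetical coordinates chosen for illustration.
print(check_primer_pair(197, 22, 480, 24))   # no warnings
print(check_primer_pair(100, 15, 160, 22))   # several warnings
```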

In one embodiment, the primer pairs selected to
amplify E1 cDNAs are universal primers. By "universal", as
used to describe primers throughout the claims and
specification, is meant those primer pairs which can
amplify E1 gene fragments derived from an HCV isolate
belonging to any one of the twelve genotypes of HCV
described herein. Purified and isolated universal primers
are used in Example 1 of the present invention and are
shown as SEQ ID NOs:103-108 where SEQ ID NOs:103 and 104
represent one pair of primers, SEQ ID NOs:105 and 106
represent a second pair of primers and SEQ ID NOs:107-108
represent a third pair of primers.

In an alternative embodiment, primer pairs
selected to amplify E1 cDNAs are genotype-specific primers.
In the present invention, genotype-specific primer pairs
can readily be derived from the following genotype-specific
nucleotide domains: nucleotides 197-238 and 450-480 of the
consensus sequence of genotype I/1a shown in Figure 1A;
nucleotides 197-238 and 450-480 of the consensus sequence
of genotype II/1b shown in Figure 1B; nucleotides 199-238
and 438-480 of the consensus sequence of genotype III/2a
shown in Figure 1C; nucleotides 124-177 and 450-480 of the
consensus sequence of genotype IV/2b shown in Figure 1D;
nucleotides 124-177, 193-238 and 436-480 of SEQ ID NO:34
(genotype 2c); nucleotides 168-207, 294-339 and 406-480 of
the consensus sequence of genotype V/3a shown in Figure 1E;
nucleotides 145-183 and 439-480 of SEQ ID NO:40 (genotype
4a); nucleotides 168-207 and 432-480 of SEQ ID NO:41
(genotype 4b); nucleotides 130-183 and 450-480 of the
consensus sequence of genotype 4c shown in Figure 1F;
nucleotides 130-183 and 450-480 of SEQ ID NO:44 (genotype
4d); nucleotides 166-208 and 437-480 of the consensus
sequence of genotype 5a shown in Figure 1G; and nucleotides
168-207, 216-252 and 429-480 of SEQ ID NO:51 (genotype 6a).
One skilled in the art would readily appreciate that in a
pair of genotype-specific primers, each primer is derived
from a different one of the genotype-specific nucleotide
domains indicated above for a given genotype. Also, as described
earlier, it is understood by one skilled in the art that
each pair of primers comprises one primer which is
complementary to the original viral RNA and the other which
is complementary to the first strand of cDNA generated by
reverse transcription of the viral RNA. For example, in a
pair of genotype-specific primers for genotype 4b, one
primer would have a nucleotide sequence derived from region
168-207 of SEQ ID NO:41 and the other primer would have a
nucleotide sequence which is the complement of region
432-480 of SEQ ID NO:41. One skilled in the art would readily
recognize that such genotype-specific domains would also be
useful in designing oligonucleotides for use as
genotype-specific hybridization probes. Indeed, the sequences of
such genotype-specific hybridization probes are disclosed
later in the specification.
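
The derivation of a primer pair from two such domains can be sketched in
Python as follows. One primer is taken directly from the first domain and the
other is the reverse complement of the second domain, so that both read
5'->3'. The 40-nucleotide template and its coordinates are made-up
placeholders rather than any sequence from the sequence listing, and the
function names are illustrative.

```python
# Illustrative sketch only: derive a primer pair from two domains of a
# cDNA sequence given as 1-based, inclusive coordinates.
from typing import Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def primer_pair(sequence: str,
                first_domain: Tuple[int, int],
                second_domain: Tuple[int, int]) -> Tuple[str, str]:
    """Return (primer from first domain, complement of second domain)."""
    f_start, f_end = first_domain
    s_start, s_end = second_domain
    if not (1 <= f_start <= f_end < s_start <= s_end <= len(sequence)):
        raise ValueError("domains must be ordered and lie within the sequence")
    same_strand_primer = sequence[f_start - 1:f_end].upper()
    opposite_strand_primer = reverse_complement(sequence[s_start - 1:s_end])
    return same_strand_primer, opposite_strand_primer


# Toy 40-nucleotide template built from four 10-mers (placeholder data).
template = "ATGCGTACCG" + "GATTCAGCTA" + "GGCTTACGAT" + "CGGATCCAAG"
print(primer_pair(template, (1, 10), (31, 40)))
```

With a real sequence, the domains for genotype 4b would be given as
(168, 207) and (432, 480), and each primer could be a shorter stretch chosen
from within its domain.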

The amplification products of PCR can be detected
either directly or indirectly. In one embodiment, direct
detection of the amplification products is carried out via
labelling of primer pairs. Labels suitable for labelling
the primers of the present invention are known to one
skilled in the art and include radioactive labels, biotin,
avidin, enzymes and fluorescent molecules. The desired
labels can be incorporated into the primers prior to
performing the amplification reaction. A preferred
labelling procedure utilizes radiolabeled ATP and T4
polynucleotide kinase (Sambrook, J. et al. (1989) in
"Molecular Cloning, A Laboratory Manual", Cold Spring
Harbor Press, Plainview, NY). Alternatively, the desired
label can be incorporated into the primer extension
products during the amplification reaction in the form of
one or more labelled dNTPs. In the present invention, the
labelled amplified PCR products can be detected by agarose
gel electrophoresis followed by ethidium bromide staining
and visualization under ultraviolet light or via direct
sequencing of the PCR products.

In yet another embodiment, unlabelled
amplification products can be detected via hybridization
with labelled nucleic acid probes, radioactively labelled
or labelled with biotin, in methods known to one skilled
in the art such as dot and slot blot hybridization
(Kafatos, F.C. et al. (1979)) or filter hybridization
(Hollander, M.C. et al. (1990)).

In one embodiment, the nucleic acid sequences
used as probes are selected from, and substantially
homologous to, SEQ ID NOs:1-51. Such probes are useful as
universal probes in that they can detect PCR-amplification
products of E1 cDNAs of an HCV isolate belonging to any of
the twelve HCV genotypes disclosed herein. The size of
these probes can range from about 200 to about 500
nucleotides.

In an alternative embodiment, the present
invention relates to a method for determining the genotype
of a hepatitis C virus present in a mammal where said
method comprises:

(a) amplifying RNA of a mammal via RT-PCR to
produce amplification products;

(b) contacting said products with at least one
genotype-specific oligonucleotide; and

(c) detecting complexes of said products which
bind to said oligonucleotide(s).

In this method, one embodiment of said
amplification step is carried out using the universal
primers (SEQ ID NO:103 through SEQ ID NO:108) as disclosed
above. In step (b) of this method, the nucleic acid
sequences used as probes are substantially homologous to
the sequences shown in SEQ ID NOs:109-135. The probes
disclosed in SEQ ID NOs:109-135 are useful in specifically
detecting PCR-amplification products of E1 cDNAs of HCV
isolates belonging to one of the twelve HCV genotypes
disclosed herein. In a preferred embodiment, probes having
sequences substantially homologous to the sequences shown
in SEQ ID NOs:109-135 are used alone or in combination with
other probes specific to the same genotype.

For example, a probe having a sequence according
to SEQ ID NO:109 can be used alone or in combination with a
probe having a sequence according to SEQ ID NO:110. The
probes derived from SEQ ID NOs:109-135 can range in size
from about 30 to about 70 nucleotides and can be
synthesized as described earlier.
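
The logic of steps (b) and (c) can be sketched in Python as a
lookup from hybridizing probes to a genotype call. This is a
simplified in-silico illustration only: hybridization is
approximated by an exact match of the probe (or its reverse
complement) within the amplified sequence, and the probe
sequences and function names are hypothetical placeholders
rather than SEQ ID NOs:109-135.

```python
# Sketch only: call a genotype from which genotype-specific probes
# "hybridize" (here: match exactly) to an amplification product.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def hybridizes(amplicon: str, probe: str) -> bool:
    """Exact-match stand-in for hybridization to either strand."""
    amplicon, probe = amplicon.upper(), probe.upper()
    return probe in amplicon or reverse_complement(probe) in amplicon


def call_genotype(amplicon: str, probes: dict):
    """Return the single genotype whose probes match, else None.

    probes maps a genotype label to a list of probe sequences, so that
    several probes specific to the same genotype can be combined.
    """
    hits = sorted(
        genotype
        for genotype, seqs in probes.items()
        if any(hybridizes(amplicon, p) for p in seqs)
    )
    return hits[0] if len(hits) == 1 else None


# Hypothetical probe panel and amplicon, for illustration only.
panel = {"I/1a": ["ACGTTGCA"], "IV/2b": ["GGGCCCAT"]}
print(call_genotype("TTTACGTTGCATTT", panel))
```

Returning None when zero or several genotypes match keeps
ambiguous results from being reported as a call; a real assay
would instead rely on hybridization stringency.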

The nucleic acid sequence used as a probe to
detect PCR amplification products of the present invention
can be labeled in single-stranded or double-stranded form.
Labelling of the nucleic acid sequence can be carried out
by techniques known to one skilled in the art. Such
labelling techniques can include radiolabels and enzymes
(Sambrook, J. et al. (1989) in "Molecular Cloning, A
Laboratory Manual", Cold Spring Harbor Press, Plainview,
New York). In addition, there are known non-radioactive
techniques for signal amplification including methods for
attaching chemical moieties to pyrimidine and purine rings
(Dale, R.N.K. et al. (1973) Proc. Natl. Acad. Sci.,
70:2238-2242; Heck, R.F. (1968) J. Am. Chem. Soc.,
90:5518-5523), methods which allow detection by
chemiluminescence (Barton, S.K. et al. (1992) J. Am. Chem.
Soc., 114:8736-8740), methods utilizing biotinylated
nucleic acid probes (Johnson, T.K. et al. (1983) Anal.
Biochem., 133:126-131; Erickson, P.P. et al. (1982) J. of
Immunological Methods, 51:241-249; Matthaei, F.S. et al.
(1986) Anal. Biochem., 157:123-128) and methods which allow
detection by fluorescence using commercially available
products.

The present invention also relates to computer
analysis of the amino acid sequences shown in SEQ ID
NOs:52-102 by the program GENALIGN. This analysis groups
the 51 amino acid sequences shown in SEQ ID NOs:52-102 into
the twelve genotypes disclosed earlier in this application
based upon the degree of variation of the amino acid
sequences. For the purposes of the present invention, the
amino acid sequence identity of E1 amino acid sequences of
the same genotype ranges from about 85% to about 100%
whereas the identity of E1 sequences of different genotypes
ranges from about 45% to about 80%.
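
The identity ranges stated above can be applied as a simple
decision rule. The following Python sketch assumes two
pre-aligned, equal-length amino acid sequences without gaps
and treats values between the two stated ranges as
indeterminate. The function names and the handling of the
intermediate zone are assumptions made for illustration and
are not taken from GENALIGN.

```python
# Sketch only: percent identity of two aligned sequences and a
# same/different genotype call using the ranges stated in the text.

def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which a and b are identical."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def compare_genotype(identity: float) -> str:
    """Classify a pairwise identity value.

    About 85-100% is taken as the same genotype and about 80% or less
    as different genotypes; the gap in between is left undecided.
    """
    if identity >= 85.0:
        return "same genotype"
    if identity <= 80.0:
        return "different genotypes"
    return "indeterminate"


# Toy 10-residue fragments differing at one position.
print(compare_genotype(percent_identity("YQVRNSTGLY", "YQVRNSSGLY")))
```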

The grouping of SEQ ID NOs:52-102 into the twelve 
HCV genotypes is shown below: 



SEQ ID NOs:     Genotypes

52-59           I/1a
60-76           II/1b
77-80           III/2a
81-84           IV/2b
85              2c
86-90           V/3a
91              4a
92              4b
93-94           4c
95              4d
96-101          5a
102             6a
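
The grouping above lends itself to a small lookup table. The
Python sketch below encodes the ranges exactly as listed and
returns the genotype for a given amino acid SEQ ID NO; the
names GENOTYPE_RANGES and genotype_of are illustrative and do
not appear in the specification.

```python
# Sketch only: map an amino acid SEQ ID NO (52-102) to its genotype
# using the grouping listed in the specification.

GENOTYPE_RANGES = [
    (52, 59, "I/1a"),
    (60, 76, "II/1b"),
    (77, 80, "III/2a"),
    (81, 84, "IV/2b"),
    (85, 85, "2c"),
    (86, 90, "V/3a"),
    (91, 91, "4a"),
    (92, 92, "4b"),
    (93, 94, "4c"),
    (95, 95, "4d"),
    (96, 101, "5a"),
    (102, 102, "6a"),
]


def genotype_of(seq_id: int) -> str:
    """Return the genotype label for an amino acid SEQ ID NO."""
    for first, last, genotype in GENOTYPE_RANGES:
        if first <= seq_id <= last:
            return genotype
    raise ValueError(f"SEQ ID NO:{seq_id} is not an E1 amino acid sequence (52-102)")


# The twelve ranges together cover all 51 isolates.
total = sum(last - first + 1 for first, last, _ in GENOTYPE_RANGES)
print(total, genotype_of(85))
```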



For those genotypes containing more than one E1
amino acid sequence, computer alignment of the constituent
sequences of each genotype was conducted using the computer
program GENALIGN in order to produce a consensus sequence
for each genotype. These alignments and their resultant
consensus sequences are shown in Figures 2A-G for the seven
genotypes (I/1a, II/1b, III/2a, IV/2b, V/3a, 4c and 5a)
which comprise more than one sequence. Further alignment
of the consensus sequences shown in Figures 2A-G with the
amino acid sequences of SEQ ID NO:85 (genotype 2c); SEQ ID
NO:91 (genotype 4a); SEQ ID NO:92 (genotype 4b); SEQ ID
NO:95 (genotype 4d) and SEQ ID NO:102 (genotype 6a) to
produce a consensus amino acid sequence for all twelve
genotypes is shown in Figure 2H. The multiple alignment of
E1 amino acid sequences shown in Figures 2A-H serves to
highlight regions of homology and non-homology between
amino acid sequences and hence, these alignments can
readily be used by one skilled in the art to derive
peptides useful in assays and vaccines for the diagnosis
and prevention of HCV infection. Examples of purified and
isolated peptides provided by the present invention are
shown as SEQ ID NOs:136-159. These peptides are derived
from two regions of the amino acid sequences shown in
Figures 2A-H, amino acids 48-80 and amino acids 138-160.
The peptides shown in SEQ ID NOs:136-159 are useful as
genotype-specific diagnostic reagents since they are
capable of detecting an immune response specific to HCV
isolates belonging to a single genotype. The
genotype-specificity of the peptides shown in SEQ ID
NOs:136-159 is as follows: SEQ ID NOs:136 and 148 are
specific for genotype IV/2b; SEQ ID NOs:137 and 149 are
specific for genotype 2c; SEQ ID NOs:138 and 150 are
specific for genotype III/2a; SEQ ID NOs:139 and 151 are
specific for genotype V/3a; SEQ ID NOs:140 and 152 are
specific for genotype II/1b; SEQ ID NOs:141 and 153 are
specific for genotype I/1a; SEQ ID NOs:142 and 154 are
specific for genotype 4a; SEQ ID NOs:143 and 155 are
specific for genotype 4c; SEQ ID NOs:144 and 156 are
specific for genotype 4d; SEQ ID NOs:145 and 157 are
specific for genotype 4b; SEQ ID NOs:146 and 158 are
specific for genotype 5a and SEQ ID NOs:147 and 159 are
specific for

genotype 6a. In SEQ ID NO:136, Xaa at position 22 is a
residue of Ala or Thr, Xaa at position 24 is a residue of
Val or Ile, Xaa at position 26 is a residue of Val or Met;
in SEQ ID NO:138, Xaa at position 5 is a Ser or Thr
residue, Xaa at position 11 is an Arg or Gln residue, Xaa
at position 12 is an Arg or Gln residue; in SEQ ID NO:139,
Xaa at position 3 is a Pro or Ser residue, Xaa at position
33 is a Leu or Met residue; in SEQ ID NO:140, Xaa at
position 5 is a Thr or Ala residue, Xaa at position 13 is a
Gly, Ala, Ser, Val or Thr residue, Xaa at position 14 is a
Ser, Thr or Asn residue, Xaa at position 15 is a Val or Ile
residue, Xaa at position 16 is a Pro or Ser residue, Xaa at
position 18 is a Thr or Lys residue, Xaa at position 19 is
a Thr or Ala residue, Xaa at position 22 is an Arg or His
residue, Xaa at position 32 is an Ala, Val or Thr residue;
in SEQ ID NO:141, Xaa at position 3 is an Ala or Pro
residue, Xaa at position 4 is a Val or Met residue, Xaa at
position 5 is a Thr or Ala residue, Xaa at position 17 is a
Thr or Ala residue, Xaa at position 18 is a Thr or Ala
residue, Xaa at position 23 is a His or Tyr residue; in SEQ
ID NO:143, Xaa at position 10 is a Val or Ala residue, Xaa
at position 11 is a Ser or Pro residue, Xaa at position 18
is an Asp or Glu residue, Xaa at position 20 is a Leu or Ile
residue; in SEQ ID NO:146, Xaa at position 3 is a Gln or
His residue, Xaa at position 12 is an Asn, Ser or Thr
residue, Xaa at position 13 is a Leu or Phe residue, Xaa at
position 23 is an Ala or Val residue; in SEQ ID NO:148, Xaa
at position 16 is a Val or Ala residue, Xaa at position 18
is a Glu or Gln residue; in SEQ ID NO:150, Xaa at position
2 is an Ala or Thr residue, Xaa at position 4 is a Met or
Leu residue, Xaa at position 9 is an Ala or Val residue,
Xaa at position 17 is an Ile or Leu residue, Xaa at
position 20 is an Ile or Val residue, Xaa at position 21 is
a Ser or Gly residue; in SEQ ID NO:151, Xaa at position 9
is a Val or Ile residue, Xaa at position 16 is a Leu or Val
residue, Xaa at position 20 is an Ile or Leu residue; in
SEQ ID NO:152, Xaa at position 2 is an Ala or Thr residue,
Xaa at position 6 is a Val or Leu residue, Xaa at position
12 is an Ile or Leu residue, Xaa at position 16 is a Val or
Ile residue, Xaa at position 17 is a Val, Leu or Met
residue, Xaa at position 19 is a Met or Val residue, Xaa at
position 21 is an Ala or Thr residue; in SEQ ID NO:153, Xaa
at position 2 is a Thr or Ala residue, Xaa at position 6 is
a Val, Ile or Met residue, Xaa at position 12 is an Ile or
Val residue, Xaa at position 16 is an Ile or Val residue; in
SEQ ID NO:155, Xaa at position 5 is a Leu or Val residue,
Xaa at position 21 is a Thr or Ala residue; in SEQ ID
NO:158, Xaa at position 1 is a Thr or Ala residue, Xaa at
position 5 is a Val or Leu residue, Xaa at position 9 is a
Leu, Met or Val residue, Xaa at position 23 is a Gly or Ala
residue.
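
Each Xaa definition above stands for a small set of
alternative residues, so a peptide with several Xaa positions
represents a family of concrete sequences. The Python sketch
below enumerates that family. The four-residue template is a
hypothetical placeholder, since the fixed residues of SEQ ID
NOs:136-159 are not reproduced in this part of the
specification, and the function name is illustrative.

```python
# Sketch only: expand a peptide containing Xaa positions into every
# concrete sequence permitted by the stated alternatives.
from itertools import product


def expand_variants(template, options):
    """Enumerate all peptides encoded by a template with Xaa positions.

    template: list of three-letter residue codes, with "Xaa" at variable
              positions.
    options:  dict mapping a 1-based position to its allowed residues.
    """
    variable = [i for i, res in enumerate(template, start=1) if res == "Xaa"]
    missing = [pos for pos in variable if pos not in options]
    if missing:
        raise ValueError(f"no alternatives given for Xaa at positions {missing}")
    variants = []
    for combo in product(*(options[pos] for pos in variable)):
        peptide = list(template)
        for pos, residue in zip(variable, combo):
            peptide[pos - 1] = residue
        variants.append(tuple(peptide))
    return variants


# Hypothetical template with two variable positions (2 x 2 = 4 variants).
template = ["Gly", "Xaa", "Cys", "Xaa"]
options = {2: ["Pro", "Ser"], 4: ["Leu", "Met"]}
for peptide in expand_variants(template, options):
    print("-".join(peptide))
```

The number of variants is the product of the number of
alternatives at each Xaa position, which grows quickly for
peptides such as SEQ ID NO:140 that have nine variable
positions.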

Those skilled in the art would be aware that the
peptides of the present invention or analogs thereof can be
synthesized by automated instruments sold by a variety of
manufacturers or can be commercially custom-ordered and
prepared. The term analog has been described earlier in
the specification and for purposes of describing the
peptides of the present invention, analogs can further
include branched or non-linear arrangements of the peptide
sequences shown in SEQ ID NOs:136-159.

Alternatively, peptides can be expressed from
nucleic acid sequences where such sequences can be DNA,
cDNA, RNA or any variant thereof which is capable of
directing protein synthesis. In one embodiment,
restriction digest fragments containing a coding sequence
for a peptide can be inserted into a suitable expression
vector that functions in prokaryotic or eukaryotic cells.
Such restriction digest fragments may be obtained from
clones isolated from prokaryotic or eukaryotic sources
which encode the peptide sequence.

Suitable expression vectors and methods of
isolating clones encoding the peptide sequences of the
present invention have previously been described.

The preferred size of the peptides of the present
invention is from about 8 to about 100 amino acids in
length.

The present invention further relates to the use
of the peptides shown in SEQ ID NOs:136-159 in methods of
detecting antibodies specific for HCV in biological
samples. In one embodiment, at least one peptide specific
for a single genotype is used in previously described
immunoassays to detect antibodies specific for a single
genotype of HCV. A preferred immunoassay is ELISA.

It is understood by one skilled in the art that
the diagnostic assays described herein using
genotype-specific oligonucleotides or genotype-specific
peptides can be useful in assisting one skilled in the art
to choose a course of therapy for the HCV-infected
individual.

In an alternative embodiment, a mixture of
peptides can be used in an immunoassay to detect antibodies
to any of the twelve genotypes of HCV. The mixture of
peptides, as disclosed herein, comprises at least one
peptide selected from SEQ ID NOs:140-141 and 152-153; one
peptide selected from SEQ ID NOs:136, 138, 148 and 150; one
peptide selected from SEQ ID NOs:142-145 and 154-157; one
peptide selected from SEQ ID NOs:146 and 158; one peptide
selected from SEQ ID NOs:139 and 151; one peptide selected
from SEQ ID NOs:138 and 150 and one peptide selected from
SEQ ID NOs:140 and 159. In a preferred embodiment, the
peptides of the present invention can be used in an ELISA
assay as described previously for E1 proteins.

The peptides or analogs thereof may be prepared
in the form of a kit, alone or in combination with other
reagents such as secondary antibodies, for use in
immunoassays. In addition, since the genotype-specific
peptides shown in SEQ ID NOs:136-159 are derived from two
variable regions in the E1 protein, amino acids 48-80 (SEQ
ID NOs:136-147) and amino acids 138-160 (SEQ ID
NOs:148-159), one skilled in the art would recognize that
these peptides would be useful as vaccines against
hepatitis C. In the present invention, a peptide from SEQ
ID NOs:136-159 can be used alone or in combination with
other peptides shown therein as immunogens in the vaccine.
Formulations suitable for administering the peptide(s) of
the present invention, routes of administration,
pharmaceutical compositions comprising the peptides and so
forth are the same as those previously described for
recombinant E1 proteins. In addition, as described for E1
proteins, the peptide(s) can also be used to prepare
antibodies to HCV E1 protein.

The peptides of the present invention may also be
supplied in the form of a kit, alone, or in the form of a
pharmaceutical composition as described above for
recombinant E1 proteins.

Any articles or patents referenced herein are
incorporated by reference. The following examples
illustrate various aspects of the invention but are in no
way intended to limit the scope thereof.

MATERIALS

Serum used in these examples was obtained from
84 anti-HCV positive individuals that were previously found
to be positive for HCV RNA in a cDNA PCR assay with primer
set a from the 5' NC region of the HCV genome (Bukh, J. et
al. (1992b) Proc. Natl. Acad. Sci. USA 89:4942-4946). These
samples were from 12 countries: Denmark (DK); Dominican
Republic (DR); Germany (D); Hong Kong (HK); India (IND);
Sardinia, Italy (S); Peru (P); South Africa (SA); Sweden
(SW); Taiwan (T); United States (US); and Zaire (Z).



Example 1

Identification of the DNA Sequence
of the E1 Gene of 51 Isolates of HCV via
RT-PCR Analysis of Viral RNA Using Universal Primers

Viral RNA was extracted from 100 µl of serum by
the guanidinium-phenol-chloroform method and the final RNA
solution was divided into 10 equal aliquots and stored at
-80°C as described (Bukh et al. (1992a)). The sequences
of the synthetic oligonucleotides used in the RT-PCR assay,
deduced from the sequence of HCV strain H-77 (Ogata, N. et
al. (1991) Proc. Natl. Acad. Sci. USA 88:3392-3396), are
shown as SEQ ID NOs:103-108. One aliquot of the final RNA
solution, equivalent to 10 µl of serum, was used for cDNA
synthesis that was performed in a 20 µl reaction mixture
using avian myeloblastosis virus reverse transcriptase
(Promega, Madison, WI) and SEQ ID NO:104 as a primer. The
resulting cDNA was amplified in a "nested" PCR assay by Taq
DNA polymerase (Amplitaq, Perkin-Elmer/Cetus) as described
previously (Bukh et al. (1992a)) with primer set e (SEQ ID
NOs:103-106). Precautions were taken to avoid
contamination with exogenous HCV nucleic acid (Bukh et al.
(1992a)), and negative controls (normal, uninfected serum)
were interspersed between every test sample in both the RNA
extraction and cDNA PCR procedures. No false positive
results were observed in the analysis. In most instances,
amplified DNA (first or second PCR products) was
reamplified with primers SEQ ID NO:107 and SEQ ID NO:108
prior to sequencing since these two primers contained EcoRI
sites which would facilitate future cloning of the E1 gene.
Amplified DNA was purified by gel electrophoresis followed
by glass-milk extraction (Geneclean, BIO 101, La Jolla, CA)
and both strands were sequenced directly by the
dideoxynucleotide chain termination method (Bachman, B. et
al. (1990) Nucl. Acids Res. 18:1309) with phage T7 DNA
polymerase (Sequenase, United States Biochemicals,
Cleveland, OH), [alpha-35S]dATP (Amersham, Arlington
Heights, IL) or [alpha-33P]dATP (Amersham or DuPont,
Wilmington, DE) and sequencing primers. RNA extracted from
serum containing HCV strain H-77, previously sequenced by
Ogata, N. et al. (1991), was amplified with primer set e
(SEQ ID NOs:103-106) and sequenced in parallel as a
control. The nucleotide sequences of the envelope 1 (E1)
gene of all 51 HCV isolates are shown as SEQ ID NOs:1-51.
In all 51 HCV isolates, the E1 gene was exactly 576
nucleotides in length and did not have any in-frame stop
codons.
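
The two properties reported above, a length of 576
nucleotides and the absence of in-frame stop codons, are easy
to check programmatically. The Python sketch below is one way
to do so under the standard genetic code; the function names
are illustrative, and only the first 39 nucleotides of SEQ ID
NO:1 (isolate DK7) are used here so that the example stays
short.

```python
# Sketch only: check reading-frame length and in-frame stop codons.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str):
    """Split a DNA sequence into codons, ignoring whitespace."""
    seq = "".join(seq.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def has_in_frame_stop(seq: str) -> bool:
    """True if any codon in frame 1 is a stop codon."""
    return any(codon in STOP_CODONS for codon in codons(seq))


def is_plausible_e1_orf(seq: str, expected_length: int = 576) -> bool:
    """True if seq has the expected length and no in-frame stop codon."""
    length = len("".join(seq.split()))
    return length == expected_length and not has_in_frame_stop(seq)


# First row (nucleotides 1-39) of SEQ ID NO:1 as listed in this document.
first_row = "TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC"
print(len(codons(first_row)), has_in_frame_stop(first_row))
```

Applied to a complete E1 sequence from the listing, the
default expected length of 576 corresponds to 192 codons.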

Example 2

Computer Analysis of the Nucleotide
and Deduced Amino Acid Sequences
of the E1 Gene of the 51 HCV Isolates

Multiple computer-generated alignments of the
nucleotide (SEQ ID NOs:1-51, Figures 1A-H) and deduced
amino acid sequences (SEQ ID NOs:52-102, Figures 2A-H) of
the cDNAs of the 51 HCV isolates constructed using the
computer program GENALIGN (Miller, R.H. et al. (1990) Proc.
Natl. Acad. Sci. USA 87:2057-2061) resulted in the 51 HCV
isolates being divided into twelve genotypes based upon the
degree of variation of the E1 gene sequence as shown in
Table 1.

The nucleotide and amino acid sequence identity
of HCV isolates of the same genotype was in the range of
88.0-99.1% and 89.1-98.4%, respectively, whereas that of
HCV isolates of different genotypes was in the range of
53.5-78.6% and 49.0-82.8%, respectively. The latter
differences are similar to those found when comparing the
envelope gene sequences of the various serotypes of the
related flaviviruses, as well as other RNA viruses. When
microheterogeneity in a sequence was observed, defined as
more than one prominent nucleotide at a specific position,
the nucleotide that was identical to that of the HCV
prototype (HCV1, Choo et al. (1989)) was reported if
possible. Alternatively, the nucleotide that was identical
to the most closely related isolate is shown.
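
The reporting rule for positions with microheterogeneity can
be written as a short function. The Python sketch below
assumes that the prominent nucleotides observed at a position
are supplied as a set, together with the nucleotide of the
HCV prototype and of the most closely related isolate at the
same position. The function name and the None result for
unresolved positions are assumptions made for illustration.

```python
# Sketch only: choose which nucleotide to report at a position where
# more than one prominent nucleotide was observed.

def resolve_position(observed, prototype_base, nearest_isolate_base):
    """Apply the stated preference order to one sequence position.

    1. Report the prototype (HCV1) nucleotide if it was observed.
    2. Otherwise report the nucleotide of the most closely related
       isolate if it was observed.
    3. Otherwise return None, since the text gives no further rule.
    """
    observed = {base.upper() for base in observed}
    if prototype_base.upper() in observed:
        return prototype_base.upper()
    if nearest_isolate_base.upper() in observed:
        return nearest_isolate_base.upper()
    return None


# Hypothetical position where A and G were both prominent.
print(resolve_position({"A", "G"}, prototype_base="G", nearest_isolate_base="A"))
```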

Analysis of the consensus sequence of the E1
protein of the 51 HCV isolates from this study demonstrated
that a total of 60 (31.3%) of the 192 amino acids of the E1
protein were invariant among these isolates (Fig. 3). Most
impressive, all 8 cysteine residues as well as 6 of 8
proline residues were invariant. The most abundant amino
acids (e.g. alanine, valine and leucine) showed a very low
degree of conservation. The consensus sequence of the E1
protein contained 5 potential N-linked glycosylation sites.
Three sites at positions 209, 305 and 325 were maintained
in all 51 HCV isolates. A site at position 196 was
maintained in all isolates except the sole isolate of
genotype 2c. Also, a site at position 234 was maintained
in all isolates except one isolate of genotype I/1a, all
four isolates of genotype IV/2b and the sole isolate of
genotype 6a. Conversely, only genotype IV/2b isolates had
a potential glycosylation site at position 233. Further
analysis revealed a highly conserved amino acid domain (aa
302-328) in the E1 protein with 20 (74.1%) of 27 amino
acids invariant among all 51 HCV isolates. It is possible
that the 5' and 3' ends of this domain are conserved due to
important cysteine residues and N-linked glycosylation
sites. The central sequence, 5'-GHRMAWDMM-3' (aa 315-323),
may be conserved due to additional functional constraints
on the protein structure. Finally, although the amino acid
sequence surrounding the putative E1 protein cleavage site
was variable, an amino acid doublet (GV) at position 380
was invariant among all HCV isolates.
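
The two kinds of analysis described above, locating potential
N-linked glycosylation sites and counting invariant positions
in an alignment, can be sketched in Python. The sketch assumes
the usual sequon definition (Asn-X-Ser/Thr with X not Pro) and
assumes that the first E1 residue is numbered 192, which is
inferred from the positions quoted in the text rather than
stated there. The 26-residue fragment was translated by hand
from the first 78 nucleotides of SEQ ID NO:1 using the
standard genetic code, and the function names are
illustrative.

```python
# Sketch only: scan for N-X-S/T sequons and list invariant alignment
# columns.

def n_glycosylation_sites(protein: str, first_position: int = 192):
    """Return positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    protein = protein.upper()
    sites = []
    for i in range(len(protein) - 2):
        if protein[i] == "N" and protein[i + 1] != "P" and protein[i + 2] in "ST":
            sites.append(first_position + i)
    return sites


def invariant_columns(aligned):
    """Return 0-based indices of columns identical in all aligned sequences."""
    return [i for i, column in enumerate(zip(*aligned)) if len(set(column)) == 1]


# N-terminal E1 fragment of isolate DK7 (translated by hand, see above).
fragment = "YQVRNSTGLYHVTNDCPNSSIVYEAA"
print(n_glycosylation_sites(fragment))

# Toy alignment of three 3-residue sequences.
print(invariant_columns(["NCT", "NCS", "NCT"]))
```

For this fragment the scan reports positions 196 and 209,
which agrees with two of the sites named in the text.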

A dendrogram of the genetic relatedness of the E1
protein of selected HCV isolates representing the 12
genotypes is shown in Fig. 4. This dendrogram was
constructed using the program CLUSTAL (Weiner, A.J. et al.
(1991) Virology 180:842-848) and had a limit of 25
sequences. The scale showing percent identity was added
based upon manual calculation. From the 51 HCV isolates
for which the complete sequence of the E1 gene region was
obtained, 25 isolates representing the twelve genotypes
were selected for analysis as follows. Among isolates with
genotype I/1a (SEQ ID NOs:52-59), as well as among isolates
with genotype II/1b (SEQ ID NOs:60-76), the two isolates
with the lowest amino acid identity within each genotype
were included. Among isolates of genotype IV/2b, isolate
DK8 (SEQ ID NO:81), which has an amino acid identity of
96.4% to isolate T8 (SEQ ID NO:84), was excluded. Among
isolates of genotype V/3a, isolates S2 (SEQ ID NO:88) and
S54 (SEQ ID NO:90), which both shared 97.9% of the amino
acids of isolates HK10 (SEQ ID NO:87) and S52 (SEQ ID
NO:89), were excluded. Finally, among isolates of genotype
VI, isolates SA4 (SEQ ID NO:97) and SA5 (SEQ ID NO:98),
with an amino acid identity to isolate SA7 (SEQ ID NO:100)
of 96.4% and 95.8%, respectively, were excluded. This
dendrogram in combination with the analysis of the E1 gene
sequence of 51 HCV isolates in Table 1 demonstrates
extensive heterogeneity of this important gene.

The worldwide distribution of the 12 genotypes
among 74 HCV isolates is depicted in Fig. 5. The complete
E1 gene sequence was determined in 51 of these HCV isolates
(SEQ ID NOs:1-51), including 8 isolates of genotype I/1a,
17 isolates of genotype II/1b and 26 isolates comprising
genotypes III/2a, IV/2b, 2c, 3a, 4a-4d, 5a and 6a. In the
remaining 23 isolates, all of genotypes I/1a and II/1b, the
genotype assignment was based on a partial E1 gene sequence
since they did not represent additional genotypes in any of
the 12 countries. The number of isolates of a particular
genotype is given in each of the 12 countries studied. Of
the twelve genotypes, genotypes I/1a and II/1b were the
most common, accounting for 48 (65%) of the 74 isolates.
Analysis of the E1 gene sequences available in the GenBank
database at the time of this study revealed that all 44
such sequences were of genotypes I/1a, II/1b, III/2a and
IV/2b. Thus, based upon E1 gene analysis, 8 new genotypes
of HCV have been identified.

Also of interest, different HCV genotypes were
frequently found in the same country, with the highest
number of genotypes (five) being detected in Denmark. Of
the twelve genotypes, genotypes I/1a, II/1b, III/2a, IV/2b
and V/3a were widely distributed with genotype II/1b being
identified in 11 of 12 countries studied (Zaire was the
only exception). In addition, while genotypes I/1a and
II/1b were predominant in the Americas, Europe and Asia,
several new genotypes were predominant in Africa.

It was also found that genotypes I/1a, II/1b,
III/2a, IV/2b and V/3a of HCV were widely distributed
around the world, whereas genotypes 2c, 4a, 4b, 4d, 5a and
6a were identified only in discrete geographical regions.
For example, the majority of isolates in South Africa
comprised a new genotype (5a) and all isolates in Zaire
comprised 3 new closely related genotypes (4a, 4b, 4c).
These genotypes were not identified outside Africa.



Example 3

Detection by ELISA Based on Antigen from
Insect Cells Expressing Complete E1 Protein

Expression of E1 protein in SF9 cells. A cDNA
(SEQ ID NO:1) encoding the complete E1 protein of SEQ ID
NO:52 is subcloned into pBlueBac-Transfer vector
(Invitrogen) using standard subcloning procedures. The
resultant recombinant expression vector is cotransfected
into SF9 insect cells (Invitrogen) by the Ca precipitation
method according to the Invitrogen protocol.

ELISA Based on Infected SF9 cells. 5 x 10^6 SF9
cells infected with the above-described recombinant
expression vector are resuspended in 1 ml of 10 mM
Tris-HCl, pH 7.5, 0.15 M NaCl and are then frozen and
thawed 3 times. 10 µl of this suspension is dissolved in
10 ml of carbonate buffer (pH 9.6) and used to cover one
flexible microtiter assay plate (Falcon). Serum samples are
diluted 1:20, 1:400 and 1:8000, or 1:100, 1:1000 and
1:10000. Blocking and washing solutions for use in the
ELISA assay are PBS containing 10% fetal calf serum and
0.5% gelatin (blocking solution) and PBS with 0.05%
Tween-20 (Sigma, St. Louis, MO) (washing solution). As a
secondary antibody, peroxidase-conjugated goat IgG fraction
to human IgG or horseradish peroxidase-labelled goat
anti-Old or anti-New World monkey immunoglobulin is used.
The results are determined by measuring the optical density
(O.D.) at 405 nm.

To determine if insect cell-derived E1 protein
representing genotype I/1a of HCV could detect anti-HCV
antibody in chimpanzees infected with genotype I/1a of HCV,
three infected chimpanzees are examined. The sera of all
3 chimpanzees are found to seroconvert to anti-HCV.



Example 4

Use of the Complete
E1 Protein as a Vaccine

Mammals are immunized with purified or partially
purified E1 protein in an amount sufficient to stimulate
the production of protective antibodies. The immunized
mammals challenged with various genotypes of HCV are
protected.

It is understood by one skilled in the art that
the recombinant E1 protein used in the above vaccine can
also be used in combination with other recombinant E1
proteins having an amino acid sequence shown in SEQ ID
NOs:52-102.

Example 5

Determination of the Genotype of an HCV
Isolate Via Hybridization of Genotype-Specific
Oligonucleotides to RT-PCR Amplification Products

Viral RNA is isolated from serum obtained from a
mammal and is subjected to RT-PCR as in Example 1.
Following amplification, the amplified DNA is purified as
described in Example 1 and aliquots of 100 ng of
amplification product are applied to twelve dots on a
nitrocellulose filter set in a dot blot apparatus. The
twelve dots are then cut into separate dots and each dot is
hybridized to a 32P-labelled oligonucleotide specific for a
single genotype of HCV. The oligonucleotides to be used as
hybridization probes are selected from SEQ ID NOs:109-135.

Example 6

ELISA Based on Synthetic
Peptides Derived From E1 cDNA Sequences

Synthetic peptides specific for genotype I/1a and
having amino acid sequences according to SEQ ID NOs:136-148
are placed in 0.1% PBS buffer and 50 µl of 1 mg/ml of
peptide is used to cover each well of the microtiter assay
plate. Serum samples from two mammals infected with
genotype I/1a HCV and from one mammal infected with
genotype 5a HCV are diluted as in Example 3 and the ELISA
is carried out as in Example 3. Both mammals infected with
genotype I/1a HCV react positively with the peptides while
the mammal infected with genotype 5a HCV exhibits no
reactivity.

Example 7

Use of the E1 Peptides as a Vaccine

Since the E1 genotype-specific peptides of the
present invention are derived from two variable regions in
the complete E1 protein, there exists support for the use
of these peptides as a vaccine to protect against a variety
of HCV genotypes. Mammals are immunized with peptide(s)
selected from SEQ ID NOs:136-159 in an amount sufficient
to stimulate production of protective antibodies. The
immunized mammals challenged with various genotypes of HCV
are protected.



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homo sapiens
(C) INDIVIDUAL ISOLATE: DK7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC 39
AAT GAT TGC CCT AAC TCG AGT ATC GTG TAC GAG GCG GCC 78
GAT GCC ATC CTG CAC ACT CCG GGG TGT GTC CCT TGC GTT 117
CGC GAG GGT AAC GTC TCG AGG TGT TGG GTG GCG ATG ACC 156
CCC ACG GTG GCC ACC AGG GAT GGC AAA CTC CCC ACA GCG 195
GAG CTT CGA CGT CAC ATC GAT CTG CTC GTC GGG AGT GCC 234
ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273
TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC 312
AGG CGC CAC TGG ACG ACG CAA GGC TGC AAT TGT TCT ATC 351
TAT CCT GGC CAT ATA ACG GGT CAC CGC ATG GCG TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACC ACG GCG TTG GTA GTA 429
GCT CAG CTG CTC CGG ATC CCG CAA GCC ATC TTG GAC ATG 468
ATC GCT GGT GCT CAC TGG GGA GTC CTG GCG GGC ATA GCG 507
TAT TTT TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546
GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG 576



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

TAC CAA GTA CGC AAC TCC TCG GGC CTC TAC CAT GTC ACC 39
AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG GCG GCC 78
GAT GCC ATC CTG CAT TCT CCA GGG TGT GTC CCT TGC GTT 117
CGC GAG GGT AAC GCC TCG AAA TGT TGG GTG GCG GTG GCC 156
CCC ACG GTG GCC ACC AGG GAC GGC AAG CTC CCC GCA ACG 195
CAG CTT CGA CGT CAC ATC GAT CTG CTT GTC GGG AGC GCC 234
ACC CTC TGC TCG GCC CTC TAT GTG GGG GAC TTG TGC GGG 273
TCT GTC TTC CTT GTC GGC CAA CTG TTC ACC TTC TCC CCC 312
AGA CGC CAC TGG ACA ACG CAA GAC TGC AAC TGT TCT ATC 351
TAC CCC GGC CAT ATT ACG GGT CAT CGC ATG GCG TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACA GCA GCG CTG GTA ATG 429
GCG CAG CTG CTC AGG ATC CCG CAG GCC ATC TTG GAC ATG 468
ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC GTG GTG 546
GTA CTG TTG CTG TTT ACC GGC GTC GAT GCG 576



35 
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(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 



CAC CAA GTG CGC AAC TCT ACA GGG CTT TAC CAT GTC ACC 39

AAT GAT TGC CCT AAT TCG AGT ATT GTG TAC GAG GCG GCC 78

GAT GCC ATC CTG CAC GCG CCG GGG TGT GTC CCT TGC GTT 117

CGC GAG GGT AAC GCC TCG AGG TGT TGG GTG GCG GTG ACC 156

CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195

CAG CTT CGA CGT CAC ATC GAC CTG CTT GTC GGG AGC GCC 234

ACC CTC TGC TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273

TCT GTC TTC CTT GTC GGT CAA CTG TTC ACC TTT TCT CCC 312

AGG CGC CAC TGG ACA ACG CAA GAC TGC AAT TGT TCT ATC 351

TAT CCC GGC CAT ATA ACG GGA CAC CGT ATG GCA TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACG ACA GCG CTG GTA ATG 429

GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468

ATC GCT GGA GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC GTG GTA 546

GTG CTG TTG CTG TTT GCC GGC GTT GAT GCG 576

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CAC CAA GTG CGC AAC TCT ACA GGG CTT TAC CAT GTC ACC 39

AAT GAT TGC CCT AAT TCG AGT ATT GTG TAC GAG GCG GCC 78

GAT GCC ATC CTG CAC ACG CCG GGG TGT GTC CCT TGC GTT 117

CGC GAG GGT AAC ACC TCG AGG TGT TGG GTG GCG GTG ACC 156

CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195

CAG CTC CGA CGT CAC ATC GAC CTG CTT GTC GGG AGC GCC 234

ACC CTC TGC TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273

TCT GTC TTC CTT GTC GGT CAA CTG TTC ACC TTC TCT CCC 312

AGG CAC CAC TGG ACA ACG CAA GAC TGC AAT TGT TCC ATC 351

TAT CCC GGC CAT ATA ACG GGC CAC CGC ATG GCG TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACG ACA GCG CTG GTA GTA 429

GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468

ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546

GTG CTG TTG CTG TTT GCC GGC GTT GAT GCG 576

(2) INFORMATION FOR SEQ ID NO: 5:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S14 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 



TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTT ACC 39

AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG ACA GCT 78

GAT GCT ATC CTA CAC GCT CCG GGA TGT GTC CCT TGC GTT 117

CGT GAG GGT AAC ACC TCG AGG TGT TGG GTG GCG ATG ACC 156

CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC GCA ACG 195

CAG CTT CGA CGT TAC ATC GAT CTG CTT GTC GGG AGC GCC 234

ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273

TCT GTC TTT CTT GTC GGT CAG CTG TTT ACC TTC TCT CCC 312

AGG CGC CTC TGG ACG ACG CAA GAC TGC AAT TGT TCT ATC 351

TAT CCC GGC CAT ATA ACG GGT CAT CGC ATG GCA TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACG ACG GCA CTG GTA GTA 429

GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAT ATG 468

ATC GCT GGT GCT CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGA AAC TGG GCG AAG GTC CTA GTG 546

GTG CTG CTG CTA TTC GCC GGC GTT GAC GCG 576

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



TAC CAA GTA CGC AAC TCC ACG GGC CTT TAC CAT GTC ACC 39

AAT GAC TGC CCT AAC TCG AGC ATT GTG TAC GAG ACG GCC 78

GAT ACC ATC CTA CAC TCT CCG GGG TGT GTC CCT TGC GTT 117

CGC GAG GGT AAC GCC TCG AGA TGT TGG GTG CCG GTG GCC 156

CCC ACA GTT GCC ACC AGG GAC GGC AAA CTC CCC GCA ACG 195

CAG CTT CGA CGT CAC ATC GAT CTG CTT GTT GGG AGC GCC 234

ACC CTC TGC TCG GCC CTC TAT GTG GGG GAC CTG TGC GGG 273

TCT GTC TTT CTT GTC AGC CAG CTG TTC ACT ATC TCC CCC 312

AGG CGC CAC TGG ACA ACG CAA GAC TGC AAC TGT TCT ATC 351

TAC CCC GGC CAT ATA ACG GGT CAC CGT ATG GCA TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACA ACG GCG TTG GTA ATA 429

GCT CAG CTG CTC AGG GTC CCG CAA GCC GTC TTG GAC ATG 468

ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GCG GGG AAC TGG GCG AAG GTC CTG CTA 546

GTG CTG TTG CTG TTT GCC GGC GTC GAT GCG 576

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: SW1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



TAC CAA GTA CGC AAC TCC TCG GGC CTT TAC CAT GTC ACC 39

AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG ACG GCC 78

GAT GCC ATT CTA CAC TCT CCA GGG TGT GTC CCT TGC GTT 117

CGC GAG GAT GGC GCC CCG AAG TGT TGG GTG GCG GTG GCC 156

CCC ACA GTC GCC ACT AGG GAC GGC AAA CTC CCT GCA ACG 195

CAG CTT CGA CGT CAC ATC GAT CTG CTT GTC GGA AGC GCC 234

ACC CTC TGC TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273

TCT GTC TTT CTC GTC AGT CAA CTG TTC ACG TTC TCC CCC 312

AGG CGC CAC TGG ACA ACG CAA GAC TGT AAC TGT TCT ATC 351

TAT CCC GGC CAC ATA ACG GGT CAC CGC ATG GCA TGG GAT 390

ATG ATG ATG AAC TGG TCC CCC ACA ACA GCG CTG GTA GTA 429

GCT CAG CTG CTC AGG ATC CCG CAA GCC GTC TTG GAC ATG 468

ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG ATA 546

GTG CTG TTG CTG TTT TCC GGC GTC GAT GCG 576

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

WO 95/01442 PCT/US94/07320

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US11 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



TAC CAA GTA CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC 39

AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG GCG GCC 78

GAT GCC ATC CTG CAC ACT CCG GGG TGT GTT CCT TGC GTT 117

CGC GAG GGT AAC GCT TCG AGG TGT TGG GTG GCG ATG ACC 156

CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195

CAA CTT CGA CGT CAC ATC GAT CTG CTT GTC GGG AGC GCC 234

ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273

TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC 312

AGA CGC CAC TGG ACG ACG CAG GGC TGC AAT TGT TCT ATC 351

TAT CCC GGC CAT ATA ACG GGT CAC CGC ATG GCA TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACG GCG GCG TTG GTG GTA 429

GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468

ATC GCT GGT GCT CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546

GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG 576

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: D1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39

AAC GAC TGT TCC AAC TCG AGC ATT GTG TAT GAG ACA GCG 78

GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG GAC AAC TCC TCT CGC TGC TGG GTA GCG CTC ACC 156

CCC ACG CTC GCG GCT AGG AAT GGC AAC GTC CCC ACT ACG 195

GCG ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCC ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTC ATC TCC CAG CTG TTC ACC CTC TCG CCT 312

CGC CGG CAT GAG ACG GTA CAG GAG TGT AAT TGC TCA ATC 351

TAT CCC GGC CAC GTG ACA GGT CAC CGT ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC TTA GTG GTA 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTC GCC 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GCT GGC GTT GAC GGC 576



(2) INFORMATION FOR SEQ ID NO: 10: 




(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: D3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAA GTC ACC 39

AAT GAC TGT TCC AAC TCG AGC ATC GTG TAT GAG ACA GCG 78

GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG GAC AAC TCC TCT CGC TGC TGG GTA GCG CTC ACC 156

CCC ACG CTC GCG GCT AGG AAT AGC AGC GTC CCC ACT ACG 195

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCC ATG TAC GTG GGG GAT CTT TGC GGA 273

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTC TCG CCT 312

CGC CGG CAT GAG ACA GTA CAG GAA TGT AAC TGC TCA ATC 351

TAT CCC GGC CAC GTG ACA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCG CCT ACA GCA GCC CTA GTG GTA 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTC GCC 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GCT GGC GTC GAC GGC 576

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAC GTC ACA 39

AAC GAC TGC TCC AAC TCA AGC ATC GTG TAT GAG GCA GTG 78

GAC GTG ATC ATG CAT ACC CCA GGG TGC GTG CCC TGC GTT 117

CGG GAG AAC AAC CAC TCC CGT TGC TGG GTA GCG CTC ACC 156

CCC ACG CTC GCG GCC AGG AAC GCC AGC ATC CCC ACT ACG 195

ACA ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTC TGC GGA 273

TCC GTT TTC CTC GTC TCT CAG CTG TTC ACC TTT TCA CCT 312

CGC CGG CAT GAG ACA GCA CAG GAC TGC AAC TGC TCA ATC 351

TAT CCC GGC CAC GTT TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC CTA GTG CTA 429

TCG CAG TTA CTC CGA ATC CCA CAA GCT GTC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTC GCC 507

TAC TAC TCC ATG GCG GGG AAC TGG GCC AAG GTT TTA ATT 546

GTG TTG CTA CTC TTT GCC GGC GTT GAT GGG 576

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK3



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



TAT GAA GTG CGC AAC GTG TCC GGG ATA TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGC GTC GTG TAT GAG ACA GCA 78

GAC ATG ATC ATG CAT ACC CCT GGA TGC GTG CCC TGC GTA 117

CGG GAG AAC AAC TCC TCC CGC TGT TGG GTA GCG CTC ACT 156

CCC ACG CTC GCG GCC AGG AAC GTC AGC GTC CCC ACC ACG 195

ACA ATA CGA CGT CAC GTC GAC TTG CTC GTT GGG GCG GCT 234

GCC TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCG CCT 312

CGC CGA CAC GAG ACA GTA CAG GAC TGC AAC TGC TCA CTC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACA GGA GCC CTA GTG GTG 429

TCG CAA TTA CTC CGG ATC CCG CAA GCT GTC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507

TAC TAT TCC ATG GTG GGA AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTT TTT GCC GGC GTT GAT GGG 576

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



CAT GAA GTG CAC AAC GTA TCC GGG ATC TAC CAT GTC ACG 39 
AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78 
GAC ATG ATC ATG CAT ACC CCC GGG TGC GTG CCC TGC GTC 117 






fpp 


pup 




■rap 


ILL 


1 LL 




D O 
IbL 


TP_P 
IVjLt 


blA 


nrrz 

LrLVjr 


prpp 
LXL 


7\ pnp 


o 


r»OP 


ALLr 


GTC 


ppp 


GLL 




AAL 


ppn 
LrLL 


T\PP 


AIL, 


ppp 
LLL 


7A P"P 
ALI 


ALb 


IOC 


AuA 


ATA 


pp* 

LGA 


LLiL 


P7\T* 

LIAi 


biL 


bAL 


TTG 


CTC 


GTT 




ppp 


ppfp 
oLl 


A 


GCT 


TTC 


TGL 


TLL 


ppp 

GLL 


ATP 

Alb 


*T»A P 


pmp 

GrG 


GGA 




CTC 


mpp 


pp 71 


^ / J 


TCT 


GTC 


TTC 


CTC 


GTL 


tcl 


LAG 


TTG 


TTC 


ALU 


TTC 


mpp 


CCT 




CGC 


CGG 


GAT 


GAG 


AUG 


G1A 


PTVP 

LAL 


GAL 


I\jL 


TV 7\ rn 

AA1 


mpp 


1 LA 


Al L 




TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTA GTG GTA 429

TCG CAG TTA CTC CGA CTC CCA CAA GCT GTC ATG GAC ATG 468

GTG GCG GGA GCC CAC TGG GGA GTC CTA GCG GGC CTT GCT 507

TAC TAT TCC ATG GTG GGG AAC TGG GCC AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



TAT GAA GTG CGC AAC GTG TCC GGG GTA TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TTA AGC ATC GTG TAC GAG ACA ACG 78

GAC ATG ATC ATG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117

CGG GAA AAC AAC TCC TCC CGT TGT TGG GTA GCG CTC GCC 156

CCC ACG CTC GCG GCC AGG AAC GCC AGC GTC CCC ACC ACG 195

GCA ATA CGA CGC CAC GTC GAC TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTT TGC GGA 273

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTC TCG CCT 312

CGC CGA CAC GAG ACG GTA CAG GAC TGC AAC TGC TCA ATC 351

TAT CCC GGC CAC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC CTA GTG GTG 429

TCG CAG TTA CTC CGG ATC CCG CAA GCT GTC GTG GAC ATG 468

GTA GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTT GCC 507

TAC TAT TCC ATG GTG GGA AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTT TTT GCC GGC GTT GAT GGG 576

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK8



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



TAT GAA GTG CGC AAC GTG TCC GGG ATA TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGC ATC GTG TAT GAA ACA GCG 78

GAC ATG ATT ATG CAT ACC CCT GGA TGC ATG CCC TGC GTT 117

CGG GAG AAC AAC TCC TCC CGT TGC TGG GTG GCG CTC ACT 156

CCC ACG CTC GCG GCT AGG AAT GTC AGC GTC CCC ACT ACG 195

ACA ATA CGA CGC CAC GTC GAC TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTT TCG CCT 312

CGC CGA CAC GAG ACG GTA CAG GAC TGC AAC TGC TCA ATC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCG CCC ACA ACA GCC CTA GTG GTG 429

TCG CAG TTA CTC CGG ATC CCG CAA GCT ATC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507

TAC TAT TCC ATG GTG GGC AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTG TTT GCC GGC GTT GAT GGG 576

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND5 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78

GAC ATG ATC ATG CAC ACT CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG GGC AAC TCC TCT CGC TGC TGG GTA GCG CTC ACT 156

CCC ACT CTC GCG GCC AGG AAC GCC AGC GTC TCC ACC ACG 195

ACA ATA CGA CAC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTA TGC GGA 273

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTC TCA CCG 312

CGC CGG CAT GAG ACA GTA CAG GAC TGC AAT TGC TCC ATC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCC TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTA GTG GTA 429

TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAT ATG 468

GTG GCG GGG GCC CAC TGG GGA ATC CTG GCG GGC CTT GCC 507

TAC TAT TCC ATG GTA GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 17: 




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



TAT GAG GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78

GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG GGC AAC TTC TCT AGT TGC TGG GTA GCG CTC ACT 156

CCC ACT CTC GCG GCT AGG AAC GCC AGC GTC CCC ACC ACG 195

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCA CCG 312

CGC CGG CAT GAG ACA GTA CAG GAC TGC AAT TGC TCC ATC 351

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA GCG GCC CTA GTG GTA 429

TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAT ATG 468

GTG GCG GGG GCC CAC TGG GGA ATC CTG GCG GGC CTT GCC 507

TAC TAT TCC ATG GTA GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: P10 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78

GAC ATG ATA ATG CAC ACC CCC GGG TGC GTG CCC TGT GTT 117

CGG GAG AAC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156

CCC ACA CTC GCG GCT AGG AAT TCC AGC GTC CCA ACT ACG 195

GCA ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT CTC CTC GTC TCC CAG CTG TTC ACC TTC TCA CCT 312

CGC CGG CAT TGG ACA GTA CAG GAC TGC AAT TGT TCA ATC 351

TAT CCT GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCG CCC ACA GCA GCC CTA GTG GTG 429



TCG CAG CTA CTC CGG ATC CCA CAA GCT ATC TTG GAT GTG 468 

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507 

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTC TTG ATT 546 

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGA 576 



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:



TAT GAA GTG CGC AAC GTA TCC GGG GCG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAC GAG GCA GCG 78

GAC GTG ATC ATG CAT ACC CCC GGG TGT GTA CCC TGC GTT 117

CAG GAG GGT AAC TCC TCC CAA TGC TGG GTG GCG CTC ACC 156

CCC ACG CTC GCG GCC AGG AAC GCT ACC GTC CCC ACC ACG 195

ACA ATA CGA CGT CAT GTC GAT TTG CTC GTT GGG GCG GCT 234

GTT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTG TGC GGA 273

TCT GTT TTC CTC ATC TCC CAG CTG TTC ACC ATC TCG CCC 312

CGT CGG CAT GAG ACA GTA CAG AAC TGC AAT TGC TCA ATC 351

TAT CCC GGA CAC GTG ACA GGT CAT CGC ATG GCC TGG GAT 390

ATG ATG ATG AAC TGG TCG CCT ACA ACA GCC CTA GTG GTA 429

TCG CAG CTA CTC CGG ATC CCA CAA GCT GTC ATG GAT ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTC GCC 507

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546

GTG ATG CTA CTT TTT GCT GGT GTT GAC GGG 576

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S45 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

TAT GAA GTG CGC AAC GTG TCC GGG GCG TAC CAT GTC ACG 39

AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GTG 78

GAC GTG ATC CTG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117

CGG GAG AAC AAC TCC TCC CGT TGC TGG GTG GCG CTC ACT 156

CCC ACG CTC GCG GCC AGG AAC TCC AGC GTC CCC ACT ACG 195

ACA ATA CGA CGT CAC GTC GAT TTG CTC GTT GGG GCG GCT 234

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273

TCT GTT TTC CTT GTT TCC CAG CTG TTC ACC TTC TCG CCT 312

CGT CGG CAT GAG ACA GTA CAG GAC TGC AAC TGT TCA ATC 351

TAT CCC GGC CAC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCG CCT ACA GCA GCC TTA GTG GTA 429

TCG GAG TTA CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507

TAG TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT CTG ATT 546

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576

(2) INFORMATION FOR SEQ ID NO: 21:

10 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

( D ) TOPOLOGY : 1 inear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM : homosapiens 
(C) INDIVIDUAL ISOLATE: SA10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 



TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG AAC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC TCC AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCC ATG TAC GTG GGG GAC CTC TGC GGA 273
TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCG CCT 312
CGC CGG TAT GAG ACA GTA CAG GAC TGC AAT TGC TCA ATC 351
TAT CCC GGC CGC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA ACA GCT CTA GTA GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT ATC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTT ATG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAT CAT GTC ACG 39 

AAC GAC TGT TCC AAC TCA AGC ATT GTG TAT GAG ACA GCG 78 

GAC ATG ATC ATG CAT ACC CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG GCC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156 

CCC ACG CTA GCA GCC AGG AAC ACC AGC GTC CCC ACT ACG 195 

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234 

GCT TTC TGC TCC GTT ATG TAC GTG GGG GAT CTC TGC GGA 273 

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACT TTT TCA CCT 312

CGC CGG CAC GAG ACA GTA CAG GAC TGC AAC TGT TCC ATC 351 

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAC 390 

ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTG GTG GTA 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468 

GTA GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCA 507 

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546 

GTG ATG CTA CTC TTT GCT GGC GTT GAC GGG 576 



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:



TAC GAA GTG CGC AAC GTG TCC GGG GTG TAC TAT GTC ACG 39
AAC GAC TGT TCC AAC TCA AGC ATT GTG TAT GAG ACA GCG 78
GAC ATG ATC ATG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117
CGG GAG AGC AAT TCC TCC CGC TGC TGG GTA GCG CTT ACT 156
CCC ACG CTC GCG GCC AGG AAC GCC AGC GTC CCC ACT AAG 195
ACA ATA CGA CGT CAC GTC GAC TTG CTC GTT GGG GCG GCT 234
GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG TTC ACT TTC TCG CCT 312
CGC CGG CAT GAG ACA GTA CAG GAC TGC AAC TGC TCA ATC 351
TAT CCC GGC CAC GTA ACA GGT CAC CGT ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACA ACG GCA CTA GTG GTG 429
TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG CTG CTA CTC TTT GCC GGC GTT GAT GGG 576
(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:



TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TTT GAG GCA GCG 78
GAC TTG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG GGC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC ACC AGC GTC CCC ACT ACG 195
ACG ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAT GTG GGA GAC CTC TGC GGA 273
TCT GTT TTC CTC GTC TCT CAG CTG TTC ACC TTC TCG CCT 312
CGC CGG CAT GAG ACT TTG CAG GAC TGC AAC TGC TCA ATC 351
TAT CCC GGC CAT CTG TCA GGT CAC CGC ATG GCT TGG GAC 390
ATG ATG ATG AAC TGG TCG CCT ACA ACA GCT CTA GTG GTG 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468
GTG ACA GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GCG GGG AAC TGG GCT AAG GTT TTA ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAT GGG 576
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:



TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAC ACT CCC GGG TGC GTG CCC TGT GTT 117
CGG GAG AAC AAT TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC GCT AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
ACT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTC TGC GGG 273
TCC GTT TTC CTC ATC TCC CAG CTG TTC ACC TTC TCG CCT 312
CGT CAG CAT GAG ACA GTA CAG GAC TGC AAT TGT TCA ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCA CCT ACA GCA GCC CTA GTG GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT CTG ATT 546
GTG TTG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:



GCC CAA GTG AGG AAC ACC AGC CGC GGT TAC ATG GTG ACT 39
AAC GAC TGT TCC AAT GAG AGC ATC ACC TGG CAG CTC CAA 78
GCC GCG GTT CTC CAC GTC CCC GGG TGT ATC CCG TGT GAG 117
AGG CTG GGA AAT ACA TCC CGA TGC TGG ATA CCG GTC ACA 156
CCA AAC GTG GCC GTG CGG CAG CCC GGC GCT CTT ACG CAG 195
GGC TTG CGG ACG CAC ATC GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCT GCC CTC TAC GTG GGG GAC CTC TGC GGC 273
GGG GTG ATG CTC GCA GCC CAG ATG TTC ATT GTC TCG CCG 312
CGA CGC CAC TGG TTT GTG CAA GAA TGC AAT TGC TCC ATC 351
TAC CCC GGT ACC ATC ACT GGA CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACA GCC ACC ATG ATC CTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATC GGC GGG GCT CAC TGG GGC GTC ATG TTT GGC TTG GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAG GTC ATT GTC 546
ATC CTC TTG CTG GCT GCT GGG GTG GAC GCG 576



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:



GCA CAA GTG AAG AAC ACC ACT AAC AGC TAC ATG GTG ACC 39
AAC GAC TGT TCT AAT GAC AGC ATC ACT TGG CAG CTC CAG 78
GCC GCG GTC CTC CAC GTC CCC GGG TGT GTC CCG TGC GAG 117
AAA ACG GGA AAT ACA TCT CGG TGC TGG ATA CCG GTT TCA 156
CCA AAC GTG GCC GTG CGG CAG CCC GGC GCC CTC ACG CAG 195
GGC TTG CGG ACG CAC ATT GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCT GCT CTT TAC GTG GGG GAC CTC TGC GGC 273
GGG GTG ATG CTC GCA GCC CAG ATG TTC ATC GTC TCG CCG 312
CAA CAT CAC TGG TTT GTG CAA GAC TGC AAT TGC TCT ATC 351
TAC CCT GGC ACC ATC ACT GGA CAC CGT ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACG GCC ACC ATG ATC CTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC TTA GAC ATC 468
GTT AGC GGG GCA CAC TGG GGC GTC ATG TTC GGC TTG GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAA GTC GTT GTC 546
ATC CTT CTG CTG GCC GCT GGG GTG GAC GCG 576
(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:



GCC GAA GTG AAG AAC ACC AGT ACC AGC TAC ATG GTG ACA 39
AAT GAC TGT TCC AAC GAC AGC ATC ACC TGG CAA CTC CAG 78
GCC GCG GTC CTC CAC GTC CCC GGG TGC GTC CCG TGC GAG 117
AGA GTT GGA AAC GCG TCG CGG TGC TGG ATA CCG GTC TCG 156
CCA AAC GTA GCT GTG CAG CGG CCT GGC GCC CTC ACG CAG 195
GGC TTG CGG ACG CAC ATC GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCC GCT CTC TAC GTG GGG GAT CTC TGC GGC 273
GGG GTA ATG CTC GCC GCT CAG ATG TTC ATT ATC TCG CCG 312
CAG CAC CAC TGG TTT GTG CAG GAA TGC AAC TGC TCC ATT 351
TAC CCT GGT ACC ATC ACT GGA CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACA ACC ACC ATG ATC TTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATC AGC GGA GCT CAC TGG GGC GTC ATG TTC GGC CTA GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAG GTC GTT GTC 546
ATC CTG TTG CTC ACC GCT GGC GTG GAC GCG 576
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: 10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:



GTC CAA GTG AAA AAC ACC AGT ACC AGC TAT ATG GTG ACC 39
AAT GAC TGC TCC AAC GAC AGC ATC ACT TGG CAA CTT GAG 78
GCT GCG GTC CTC CAC GTT CCC GGG TGT GTC CCG TGC GAG 117
AAA GTG GGA AAT ACA TCT CGG TGC TGG ATA CCG GTC TCA 156
CCA AAT GTG GCC GTG CAG CGG CCT GGC GCC CTC ACG CAG 195
GGC TTG CGG ACT CAC ATC GAC ATG GTC GTG ATG TCC GCC 234
ACG CTC TGC TCC GCT CTT TAC GTG GGG GAC TTC TGC GGT 273
GGG ATG ATG CTC GCA GCC CAA ATG TTC ATT GTC TCG CCG 312
CGC CAC CAC TCG TTT GTG CAG GAA TGC AAC TGC TCC ATC 351
TAG CCC GGT ACC ATC ACC GGG CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACG GCC ACT TTG ATC CTG 429
GCG TAC GTG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATT AGC GGG GCG CAT TGG GGC GTC TTG TTC GGC TTA GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAA GTC GTT GTC 546
ATC CTT CTG CTA GCC GCT GGG GTG GAC GCG 576
(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:



GTG GAA GTC AGG AAC ATC AGT TCC AGC TAC TAC GCC ACC 39
AAT GAT TGC TCA AAC AAC AGC ATC ACC TGG CAA CTC ACC 78
GAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC CTG CGC TGC TGG ATA CAA GTG ACA 156
CCT AAT GTG GCT GTG AAA CAC CGC GGC GCA CTT ACT CAT 195
AAC CTG CGA ACA CAC GTC GAC GTG ATC GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC GTA TGC GGG 273
GCC GTG ATG ATC GTG TCG CAG GCT CTC ATA ATA TCG CCT 312
GAA CGC CAC AAC TTT ACC CAG GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CAT ATC ACC GGC CAC CGC ATG GCA TGG GAC 390
ATG ATG CTA AAC TGG TCA CCA ACT CTT ACC ATG ATC CTC 429
GCC TAT GCC GCT CGT GTT CCT GAG CTA GCC CTC CAG GTT 468
GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAG GGA GCG TGG GCC AAA GTC ATT GCC 546
ATC CTC CTT CTT GTC GCA GGA GTG GAT GCA 576
(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:



GTG GAA GTC AGG AAC ACC AGT TCT AGT TAC TAC GCC ACC 39
AAT GAT TGC TCA AAC AAC AGC ATC ACC TGG CAA CTC ACC 78
AAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC CTG CAC TGC TGG ATA CAA GTG ACA 156
CCT AAT GTG GCT GTG AAA CAC CGC GGC GCA CTC ACT CAC 195
AAC CTG CGA GCA CAT ATA GAT ATG ATT GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC GTG TGC GGG 273
GCC GTG ATG ATC GTG TCG CAG GCT TTC ATA GTA TCG CCA 312
GAA CAC CAC CAC TTT ACC CAA GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CAC ATC ACC GGC CAC CGC ATG GCA TGG GAC 390
ATG ATG CTT AAC TGG TCA CCA ACT CTC ACC ATG ATC CTC 429
GCC TAT GCC GCC CGT GTT CCT GAG CTA GTC CTT GAA GTC 468
GTC TTC GGT GGT CAT TGG GGT GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAG GGA GCG TGG GCC AAG GTC ATT GCC 546
ATC CTC CTT CTT GTA GCA GGA GTG GAT GCA 576
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:




GTG GAA GTC AGG AAC ATC AGT TCT AGC TAC TAT GCC ACC 39
AAT GAT TGC TCA AAC AGC AGC ATC ACC TGG CAA CTC ACC 78
AAC GCA GTC CTC CAC CTT CCC GGA TGC GTC CCG TGT GAG 117
AAT GAT AAT GGC ACC CTG CAC TGC TGG ATA CAA GTG ACA 156
CCT AAT GTG GCT GTG AAA CAC CGC GGC GCG CTC ACT CAC 195
AAC CTG CGA GCA CAC GTC GAT ATG ATC GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC ATG TGC GGG 273
GCC GTG ATG ATC GTG TCG CAG GCT TTC ATA ATA TCG CCA 312
GAA CGC CAC AAC TTT ACC CAA GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CGT ATC ACC GGC CAC CGC ATG GCG TGG GAC 390
ATG ATG CTA AAC TGG TCA CCA ACT CTT ACC ATG ATC CTT 429
GCC TAT GCC GCT CGT GTT CCT GAG CTA GTC CTT GAA GTT 468
GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAA GGA GCG TGG GCC AAG GTC ATT GCC 546
ATC CTC CTG CTT GTC GCA GGA GTG GAT GCA 576



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 



GTG GAA GTT AGA AAC ACC AGT TTT AGC TAC TAC GCC ACC 39
AAT GAT TGC TCG AAC AAC AGC ATC ACC TGG CAG CTC ACC 78
AAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC TTG CGC TGC TGG ATA CAA GTA ACA 156
CCT AAT GTG GCT GTG AAA CAC CGT GGC GCA CTC ACT CAC 195
AAC CTG CGA ACG CAT GTC GAC GTG ATC GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGG GAC GTG TGC GGG 273
GCC GTG ATG ATA GCG TCG CAG GCT TTC ATA ATA TCG CCA 312
GAA CGC CAC AAC TTC ACC CAG GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CAT ATC ACC GGC CAC CGC ATG GCA TGG GAC 390
ATG ATG CTG AAC TGG TCA CCA ACT CTC ACC ATG ATC CTC 429
GCC TAC GCT GCT CGT GTG CCT GAA CTA GTC CTT GAA GTT 468
GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAA GGA GCG TGG GCC AAA GTC ATC GCC 546
ATC CTC CTC CTT GTC GCA GGA GTG GAC GCA 576
(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S83

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

GTG GAG GTC AAG GAC ACC GGC GAC TCC TAC ATG COG" ACC 39 

AAC GAT TGC TCC AAC TCT AGT ATC GTT TGG CAG CTT GAA 78 

GGA GCA GTG CTT CAT ACT CCT GGA TGC GTC CCT TGT GAG 117 



CGT ACC GCC AAC GTC TCT CGA TGT TGG GTG CCG GTT GCC 156
CCC AAT CTC GCC ATA AGT CAA CCT GGC GCT CTC ACT AAG 195
GGC CTG CGA GCA CAC ATC GAT ATC ATC GTG ATG TCT GCT 234
ACG GTC TGT TCT GCC CTT TAT GTG GGG GAC GTG TGT GGC 273
GCG CTG ATG CTG GCC GCT CAG GTC GTC GTC GTG TCG CCA 312
CAA CAC CAT ACG TTT GTC CAG GAA TGC AAC TGT TCC ATA 351
TAC CCG GGC CGC ATT ACG GGA CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACT ACC ACC ATG CTC CTG 429
GCG TAC TTG GTG CGC ATC CCG GAA GTC ATC TTG GAT ATT 468
GTT ACA GGA GGT CAT TGG GGT GTA ATG TTT GGC CTC GCT 507
TAC TTC TCC ATG CAG GGA TCG TGG GCG AAG GTC ATC GTT 546
ATC CTC CTG CTG ACT GCT GGG GTG GAG GCG 576
(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK12

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

TTA GAG TGG CGG AAT GTG TCC GGC CTC TAC GTC CTT ACC 39 

AAC GAC TGT TCC AAT AGC AGT ATC GTG TAT GAG GCC GAT 78 

GAC GTC ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117

CAG GAC GGC AAT ACA TCT ACG TGC TGG ACC TCA GTG ACG 156 

CCT ACA GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195 

TCG ATA CGC AGT CAT GTG GAC CTG CTA GTG GGC GCG GCC 234 

ACG ATG TGC TCT GCG CTC TAC GTG GGT GAT GTG TGT GGG 273 

GCC GTC TTC CTT GTG GGA CAA GCC TTC ACG TTC AGA CCT 312 

CGT CGC CAT CAA ACA GTC CAG ACC TGT AAC TGC TCG CTG 351 

TAC CCA GGC CAT CTT TCA GGA CAT CGA ATG GCT TGG GAT 390 

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTA 429

GCG CAC GTC CTG CGT CTG CCC CAG ACC TTG TTC GAC ATA 468 

ATA GCT GGG GCC CAT TGG GGC ATC ATG GCG GGC CTA GCC 507 

TAT TAC TCC ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546 

ATC ATG GTT ATG TTT TCA GGA GTC GAT GCC 576 



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:



CTA GAG TGG CGG AAT GTG TCT GGC CTC TAT GTC CTT ACC 39
AAC GAC TGT CCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78
GAC GTC ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117
CAG GAC GGC AAT ACA TCC ACG TGC TGG ACC TCG GTG ACA 156
CCT ACA GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCC 195
TCG ATA CGC AGT CAT GTG GAC CTG TTA GTG GGC GCG GCC 234
ACG ATG TGC TCT GCG CTC TAC GTG GGC GAT ATG TGT GGG 273
GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCG 312
CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAC CTT TCA GGA CAT CGA ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCC GTG GGT ATG GTG GTG 429
GCG CAC GTC CTG CGG TTG CCC CAG ACC TTG TTC GAC ATA 468
ATA GCC GGG GCC CAT TGG GGC ATC TTG GCA GGC CTA GCC 507
TAT TAC TCC ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546
ATC ATG GTT ATG TTT TCA GGG GTC GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:



CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC CTC ACC 39
AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78
GAC GTT ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117
CAG GAC GGT AAT ACA TCC ACG TGC TGG ACC CCA GTG ACA 156
CCT ACA GTG GCA GTC AGG TAT GTC GGA GCA ACC ACC GCT 195
TCG ATA CGC AGT CAT GTG GAC CTA TTG GTG GGC GCG GCC 234
ACT ATG TGC TCT GCG CTC TAC GTG GGT GAT ATG TGT GGG 273
GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312
CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAT CTT TCA GGA CAT CGC ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429
GCG CAC GTT CTG CGT TTG CCC CAG ACC GTG TTC GAC ATA 468
ATA GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507
TAT TAC TCC ATG CAA GGC AAC TGG GCC AAG GTC GCT ATC 546
ATC ATG GTT ATG TTT TCA GGG GTC GAC GCC 576



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S52

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:



CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC CTT ACC 39
AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78
GAC GTC ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT 117
CAG GAC GGC AAT ACA TCC ATG TGC TGG ACC CCA GTG ACA 156
CCT ACG GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195
TCG ATA CGC AGT CAT GTG GAC CTA TTA GTG GGC GCG GCC 234
ACG CTG TGC TCT GCG CTC TAT GTG GGT GAT ATG TGT GGG 273
GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312
CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAT GTT TCA GGA CAT CGA ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429
GCG CAC ATC CTG CGA TTG CCC CAG ACC TTG TTT GAC ATA 468
CTG GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507
TAT TAT TCT ATG CAG GGC AAC TGG GCC AAG GTC GCT ATT 546
GTC ATG ATT ATG TTT TCA GGG GTC GAT GCC 576
(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S54

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:




CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT ATC CTT ACC 39
AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78
GAC GTC ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT 117
CAG GAC GGC AAT ACA TCC ACG TGC TGG ACC CCA GTG ACA 156
CCT ACG GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195
TCG ATA CGC AGT CAT GTG GAC CTA TTA GTG GGC GCG GCC 234
ACG CTG TGC TCT GCG CTC TAT GTG GGT GAT ATG TGT GGG 273
GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312
CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAT CTT TCA GGA CAT CGA ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429
GCG CAC ATC CTG CGA TTG CCC GAG ACC TTG TTT GAC ATA 468
CTG GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507
TAT TAT TCT ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546
ATC ATG ATT ATG TTT TCA GGG GTC GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: Z4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:




GAG CAC TAC CGG AAT GCT TCG GGC ATC TAT CAC ATC ACC 39

AAT GAT TGT CCG AAT TCC AGT ATA GTC TAT GAA GCT GAC 78

CAT CAC ATC CTA CAC TTG CCG GGG TGC GTA CCC TGT GTG 117

ATG ACT GGG AAC ACA TCG CGT TGC TGG ACG CCG GTG ACG 156

CCT ACA GTG GCT GTC GCA CAC CCG GGC GCT CCG CTT GAG 195

TCG TTC CGG CGA CAT GTG GAC TTA ATG GTA GGC GCG GCC 234

ACT TTG TGT TCT GCC CTC TAT GTT GGG GAC CTC TGC GGA 273

GGT GCC TTC CTG ATG GGG CAG ATG ATC ACT TTT CGG CCG 312

CGT CGC CAC TGG ACC ACG CAG GAG TGC AAT TGT TCC ATC 351

TAC ACT GGC CAT ATC ACC GGC CAC AGG ATG GCG TGG GAC 390

ATG ATG ATG AAC TGG AGC CCT ACC ACC ACT CTG CTC CTC 429

GCC CAG ATC ATG AGG GTC CCC ACA GCC TTT CTC GAC ATG 468

GTT GCC GGA GGC CAC TGG GGC GTC CTC GCG GGC TTG GCG 507

TAC TTC AGC ATG CAA GGC AAT TGG GCC AAG GTA GTC CTG 546

GTC CTT TTC CTC TTT GCT GGG GTA GAC GCC 576



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: Z1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:



GTG CAC TAC CGG AAT GCT TCG GGC GTC TAT CAT GTC ACC 39

AAT GAT TGC CCT AAC ACC AGC ATA GTG TAC GAG ACG GAG 78

CAC CAC ATC ATG CAC TTG CCA GGG TGT GTC CCC TGT GTG 117

CGG ACG GAG AAT ACT TCT CGC TGC TGG GTG CCC TTG ACC 156

CCC ACT GTG GCC GCG CCC TAT CCC AAC GCA CCG TTA GAG 195

TCC ATG CGC AGG CAT GTA GAC CTG ATC GTG GGT GCG GCT 234

ACT ATG TGT TCC GCC TTC TAC ATT GGA GAT CTG TGT GGA 273

GTC TTC CTA GTG GGC CAG CTG TTC GAC TTC CGA CCG 312

CGG CAC TGG ACC ACC CAG GAT TGC AAC TGC TCC ATC 351

TAT CCT GGT CAC GTC TCG GGC CAC AGG ATG GCC TGG GAC 390

ATG ATG ATG AAC TGG AGC CCT ACC AGC GCG CTG ATT ATG 429

GCT GAG ATC TTA CGG ATC CCC TCT ATC CTA GGT GAC TTG 468

CTC ACC GGG GGT CAC TGG GGA GTT CTT GCT GGT CTA GCT 507

TTC TTC AGC ATG CAG AGT AAC TGG GCG AAG GTC ATC CTG 546

GTC CTA TTC CTC TTT GCC GGG GTC GAG GGA 576

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: Z6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:



GTT AAC TAT CGC AAT GCC TCG GGC GTC TAT CAC GTC ACC 39

AAC GAC TGC CCG AAC TCG AGC ATA GTG TAT GAG GCC GAA 78

CAC CAG ATC TTA CAC CTC CCA GGG TGC TTG CCC TGT GTG 117

AGG GTT GGG AAT CAG TCA CGC TGC TGG GTG GCC CTT ACT 156

CCC ACC GTG GCG GTG TCT TAT ATC GGT GCT CCG CTT GAC 195

TCC CTC CGG AGA CAT GTG GAC CTG ATG GTG GGC GCC GCT 234

ACT GTA TGC TCT GCC CTC TAC GTT GGA GAT CTG TGC GGT 273

GGT GCA TTC TTG GTT GGC CAG ATG TTC TCC TTC CAG CCG 312

CGA CGC CAC TGG ACT ACG CAG GAC TGC AAT TGT TCT ATC 351

TAC GCA GGG CAT ATC ACG GGC CAC AGG ATG GCA TGG GAC 390

ATG ATG ATG AAC TGG AGT CCC ACA ACC ACC CTG CTT CTC 429

GCC CAG GTC ATG AGG ATC CCT AGC ACT CTG GTA GAT CTA 468

CTC GCT GGA GGG CAC TGG GGC GTC CTT GTT GGG TTG GCG 507

TAC TTC AGT ATG CAA GCT AAT TGG GCC AAA GTC ATC CTG 546

GTC CTT TTC CTC TTC GCT GGA GTT GAT GCC

(2) INFORMATION FOR SEQ ID NO: 43:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: Z7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:



GTC AAC TAT CAC AAT GCC TCG GGC GTC TAT CAC ATC ACC 39

AAC GAC TGC CCG AAC TCG AGC ATA ATG TAT GAG GCC GAA 78

CAC CAC ATC CTA CAC CTC CCA GGG TGC GTA CCC TGT GTG 117

AGG GAG GGG AAC CAG TCA CGC TGC TGG GTG GCC CTT ACT 156

CCC ACC GTG GCG GCG CCT TAT ATC GGT GCA CCG CTT GAA 195

TCC ATC CGG AGA CAT GTG GAC CTG ATG GTA GGC GCT GCT 234

ACA GTG TGC TCC GCT CTC TAC ATT GGG GAC CTG TGC GGT 273

GGC GTA TTT TTG GTT GGT CAG ATG TTT TCT TTC CAG CCG 312

CGA CGC CAC TGG ACT ACG CAG GAC TGC AAT TGT TCC ATC 351

TAT GCG GGG CAC GTT ACA GGC CAC AGA ATG GCA TGG GAC 390

ATG ATG ATG AAC TGG AGT CCC ACA ACC ACC TTG GTC CTC 429

GCC CAG GTT ATG AGG ATC CCT AGC ACT CTG GTG GAC CTA 468

CTC ACT GGA GGG CAC TGG GGT ATC CTT ATC GGG GTG GCA 507

TAC TTC TGC ATG CAA GCT AAT TGG GCC AAG GTC ATT CTG 546

GTC CTT TTC CTC TAC GCT GGA GTT GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:



TAC AAC TAT CGC AAC AGC TCG GGT GTC TAC CAT GTC ACC 39

AAC GAT TGC CCG AAC TCG AGC ATA GTC TAT GAA ACC GAT 78

TAC CAC ATC TTA CAC CTC CCG GGA TGC GTT CCT TGC GTG 117

AGG GAA GGG AAC AAG TCT ACA TGC TGG GTG TCT CTC ACC 156

CCC ACC GTG GCT GCG CAA CAT CTG AAT GCT CCG CTT GAG 195

TCT TTG AGA CGT CAC GTG GAT CTG ATG GTG GGC GGC GCC 234

ACT CTC TGC TCC GCC CTC TAC ATC GGA GAC GTG TGT GGG 273

GGT GTG TTC TTG GTC GGT CAA CTG TTC ACC TTC CAA CCT 312

CGC CGC CAC TGG ACC ACC CAA GAC TGC AAT TGT TCC ATC 351

TAC ACA GGA CAT ATC ACA GGA CAC AGA ATG GCT TGG GAC 390

ATG ATG ATG AAT TGG AGC CCC ACT GCG ACG CTG GTC CTC 429

GCC CAA CTT ATG AGG ATC CCA GGC GCC ATG GTC GAC CTG 468

CTT GCA GGC GGC CAC TGG GGC ATT CTG GTT GGC ATA GCG 507

TAC TTC AGC ATG CAA GCT AAT TGG GCC AAG GTT ATC CTG 546

GTC CTG TTT CTC TTT GCT GGA GTC GAC GCT

(2) INFORMATION FOR SEQ ID NO: 45:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:



GTT CCC TAC CGG AAT GCC TCT GGG GTT TAC CAT GTC ACC 39

AAT GAC TGC CCA AAC TCC TCC ATA GTC TAC GAG GCT GAT 78

AGC CTG ATC TTG CAC GCA CCT GGC TGC GTG CCC TGT GTC 117

AGG CAA GAT AAT GTC AGT AGG TGC TGG GTC CAA ATC ACC 156

CCC ACA CTG TCA GCC CCG ACC TTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGA GGA GCT 234

GCT CTC TGC TCC GCA CTA TAC GTC GGC GAC GCG TGC GGG 273

GCA GTG TTT CTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312

CGC CAG CAT ACC ACA GTG CAG GAC TGC AAC TGT TCC ATT 351

TAC AGT GGC CAT ATC ACC GGC CAC CGG ATG GCT TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG CTG ATG 429

GCC CAG ATG CTA CGG ATC CCC CAG GTG GTC ATA GAC ATC 468

ATA GCC GGG GGC CAC TGG GGG GTC TTG TTT GCC GCC GCA 507

TAC TTT GCG TCG GCC GCC AAC TGG GCT AAG GTA GTG CTG 546

GTT CTG TTC CTG TTT GCG GGG GTC GAT GGC

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:



GTT CCC TAC CGA AAC GCC TCT GGG GTT TAT CAT GTC ACC 39

AAT GAT TGC CCA AAC TCT TCC ATA GTT TAC GAG GCT GAT 78

AAC CTG ATC TTG CAT GCA CCT GGT TGC GTG CCT TGT GTC 117

AGG CAA GAT AAT GTC AGT AAG TGC TGG GTC CAA ATC ACC 156

CCC ACG TTG TCA GCC CCG AAT CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGA GGG GCT 234

GCC CTC TGC TCC GCA CTA TAC GTC GGG GAC GCG TGC GGG 273

GCA GTG TTT TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312

CGC CAG CAC ACT ACG GTG CAA GAC TGC AAT TGC TCT ATT 351

TAC AGT GGC CAT ATC ACC GGC CAC CGG ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACG GCC TTG CTG ATG 429

GCC CAG TTG CTA CGG ATT CCC CAG GTG GTC ATC GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTT GCC GCC GCA 507 

TAT TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT ATA CTG 546 

GTC TTG TTT CTG TTT GCG GGG GTC GAT GCC 576 



(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:



GTC CCC TAC CGA AAT GCC TCT GGG GTT TAT CAT GTC ACC 39

AAT GAT TGC CCA AAC TCT TCC ATA GTC TAC GAG GCT GAT 78

AAC CTG ATT CTG CAC GCA CCT GGT TGC GTG CCC TGT GTC 117

AAG GAA GGT AAT GTC AGT AGG TGC TGG GTC CAA ATC ACC 156

CCC ACA TTG TCA GCC CCG AAC CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GTC GTT GAC TAC TTA GCG GGA GGG GCT 234

GCC CTC TGC TCC GCA CTA TAC GTC GGG GAC GCG TGC GGG 273

GCA GTG TTC TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312

CGC CAG CAT ACT ACG GTG CAG GAC TGC AAC TGT TCC ATT 351

TAC AGC GGC CAT ATC ACC GGC CAC CGA ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG GTG ATG 429

GCC CAG GTG CTA CGG ATT CCC CAA GTG GTC ATT GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GTC GCA 507

TAC TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT GTG CTG 546

GTC CTG TTT CTG TTT GCG GGG GTC GAT GGC

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:



GTT CCT TAC CGG AAT GCC TCT GGG GTG TAT CAT GTT ACC 39 

AAT GAT TGC CCA AAC TCT TCC ATA GTC TAT GAG GCT GAT 78 

GAC CTG ATC CTA CAC GCA CCT GGC TGC GTG CCC TGT GTC 117 

CGG AAG GAT AAT GTC AGT AGA TGC TGG GTT CAT ATC ACC 156 



CCC


ACA 


CTA 


lUi 




pro 




PTC 




GCG 


GTC 


ACG 


GCT 


195 


CCT 


CTT 




Abb 




ul 1 


PIT 


Tar 






GGA GGG 


GCC 


234 


GCC 


CTG 


TGC 


TCC 




TTA 










GTG 


TGC 


GGG 


273 


GCA TTG TTT TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312

CGC CAG CAT GCT ACG GTA CAG GAC TGC AAC TGC TCC ATT 351

AGT GGC CAT ATC ACT GGC CAC CGG ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCC GCG ACA GCC TTG GTG ATG 429

GCC CAA ATG CTA CGG ATT CCC CAG GTG GTC ATT GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GCT GCA 507

TAC TTC GCG TCG GCG GCT AAC TGG GCT AAG GTT GTG CTG 546

GTC TTG TTT CTG TTT GCG GGG GTT GAT GCC

(2) INFORMATION FOR SEQ ID NO: 49:













(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA7 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 



GTC CCC TAC CGA AAT GCC TCC GGG GTT TAT CAT GTC ACC 39

AAT GAT TGC CCG AAC TCT TCC ATA GTC TAT GAG GCT GAC 78

AAC CTG ATC CTG CAC GCA CCT GGT TGC GTG CCC TGT GTC 117

AGA CAA AAT AAT GTC AGT AGG TGC TGG GTC CAA ATC ACC 156

CCC ACA TTG TCA GCC CCG AAC CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC CTA GCG GGA GGG GCT 234

GCC CTC TGC TCC GCG CTA TAC GTC GGG GAC GCG TGC GGG 273

GCA GTG TTT TTG GTA GGC CAG ATG TTC AGC TAT AGG CCT 312

CGC CAG CAC ACT ACG GTG CAG GAC TGC AAC TGT TCC ATT 351

TAC AGT GGC CAT ATC ACC GGC CAC CGA ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG GTG ATG 429

GCC CAG TTG CTA CGG ATT CCC CAG GTG GTC ATC GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GCC GCA 507

TAT TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT GTG CTG 546

GTC TTG TTT CTG TTT GCG GGG GTC GAT GCC

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:



GTT CCC TAC CGA AAT GCC TCT GGG GTT TAT CAT GTC ACC 39

AAT GAT TGC CCA AAC TCT TCC ATC GTC TAC GAG GCT GAT 78

GAC CTG ATC TTA CAC GCA CCT GGT TGC GTG CCC TGT GTT 117

AGG CAG GGT AAT GTC AGT AGG TGC TGG GTC CAG ATC ACC 156

CCC ACA CTG TCA GCC CCG AGC CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGG GGG GCT 234

GCC CTT TGC TCC GCG TTA TAC GTC GGA GAC GCG TGC GGG 273

GCA GTG TTT TTG GTA GGT CAA ATG TTC ACC TAT AGC CCT 312

CGC CGG CAT AAT GTT GTG CAG GAC TGC AAC TGT TCC ATT 351

TAC AGT GGC CAC ATC ACC GGC CAC CGG ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACA ACA GCT TTG GTG ATG 429

GCC CAG TTG TTA CGG ATT CCC CAG GTG GTC ATT GAC ATC 468

ATT GCC GGG GCC CAC TGG GGG GTC TTG TTC GCC GCC GCA 507

TAC TAC GCG TCG GCG GCT AAC TGG GCC AAG GTT GTG CTG 546

GTC CTG TTT CTG TTT GCG GGG GTC GAT GCC

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:



CTT ACC TAC GGC AAC TCC AGT GGG CTA TAC CAT CTC ACA 39

AAT GAT TGC CCC AAC TCC AGC ATC GTG CTG GAG GCG GAT 78

GCT ATG ATC TTG CAT TTG CCT GGA TGC TTG CCT TGT GTG 117

AGG GTC GAT GAT CGG TCC ACC TGT TGG CAT GCT GTG ACC 156

CCC ACC CTG GCC ATA CCA AAT GCT TCC ACG CCC GCA ACG 195

GGA TTC CGC AGG CAT GTG GAT CTT CTT GCG GGC GCC GCA 234

GTG GTT TGC TCA TCC CTG TAC ATC GGG GAC CTG TGT GGC 273

TCT CTC TTT TTG GCG GGA CAA CTA TTC ACC TTT CAG CCC 312

CGC CGT CAT TGG ACT GTG CAA GAC TGC AAC TGC TCC ATC 351

TAT ACA GGC CAC GTC ACC GGC CAC AGG ATG GCT TGG GAC 390

ATG ATG ATG AAC TGG TCA CCC ACA ACC ACT CTG GTC CTA 429

TCT AGC ATC TTG AGG GTA CCT GAG ATT TGT GCG AGT GTG 468

ATA TTT GGT GGC CAT TGG GGG ATA CTA CTA GCC GTT GCC 507

TAC TTT GGC ATG GCT GGC AAC TGG CTA AAA GTT CTG GCT 546

GTT CTG TTC CTA TTT GCA GGG GTT GAA GCA

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:



Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Val Ser
35 40 45

Arg Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Thr Ala Gln Leu Arg Arg His Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Gly Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:



Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
20 25 30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser
35 40 45

Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Ala Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Val Val Val Leu Leu Leu Phe Thr Gly Val Asp Ala
185 190







(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 



His Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser
35 40 45

Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Val Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DR4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:



His Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr Ser
35 40 45

Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

His His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S14

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:



Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr Ser
35 40 45

Arg Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly
50 55 60

Lys Leu Pro Ala Thr Gln Leu Arg Arg Tyr Ile Asp Leu Leu Val
65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
95 100 105

Arg Leu Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
170 175 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
185 190







(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Tyr Gin Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Aan Asp 

5 10 15 

Cys Pro Asn Ser Ser He Val Tyr Glu Thr Ala Asp Thr He Leu 

20 25 30 

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser 
5 35 40 45 

Arg Cys Trp Val Pro Val Ala Pro Thr Val Ala Thr Arg Asp Gly 

50 55 60 

Lys Leu Pro Ala Thr Gin Leu Arg Arg His He Asp Leu Leu Val 

65 70 75 

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys 

30 85 90 

Gly Ser Val Phe Leu Val Ser Gin Leu Phe Thr He Ser Pro Arg 
10 95 100 105 

Arg His Trp Thr Thr Gin Asp Cys Asn Cys Ser He Tyr Pro Gly 
110 115 120 

His He Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp 
125 130 135 

Ser Pro Thr Thr Ala Leu Val He Ala Gin Leu Leu Arg Val Pro 
140 145 150 

Gin Ala Val Leu Asp Met He Ala Gly Ala His Trp Gly Val Leu 
155 160 165 

Ala Gly He Ala Tyr Phe Ser Met Ala Gly Asn Trp Ala Lys Val 
170 175 180 

Leu Leu Val Leu Leu Leu Phe Ala Gly Val Asp Ala 
185 190 

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala Ile Leu
                20                  25                  30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Asp Gly Ala Pro
                35                  40                  45

Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly
                50                  55                  60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
                65                  70                  75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Leu Leu Leu Phe Ser Gly Val Asp Ala
                185                 190







(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser
                35                  40                  45

Arg Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly
                50                  55                  60

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
                65                  70                  75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Trp Thr Thr Gln Gly Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: D1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asp Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Gly
                50                  55                  60

Asn Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Leu Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: D3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:


Tyr Glu Val Arg Asn Val Ser Gly Val Tyr Gln Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala [residues 57-60 illegible]
                50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val [residues 72-75 illegible]
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Val Asp Val Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn His Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                50                  55                  60

Ser Ile Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Ala Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Leu Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Ala Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly
                185                 190





(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:


Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Val Val Tyr Glu Thr Ala Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Val
                50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Leu Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190









(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homo sapiens 
(C) INDIVIDUAL ISOLATE: HK4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 



His Glu Val His Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                50                  55                  60

Ser Ile Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Leu Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



WO 95/01442 PCT/US94/07320

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:






Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Leu Ser Ile Val Tyr Glu Thr Thr Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Ala Pro Thr Leu Ala Ala Arg Asn Ala
                50                  55                  60

Ser Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190




(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:




Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Met Pro Cys Val Arg Glu Asn Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Val
                50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                50                  55                  60

Ser Val Ser Thr Thr Thr Ile Arg His His Val Asp Leu Leu Val
                65                  70                  75



Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu [illegible] Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln [illegible] [illegible] Asn [illegible] Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala [illegible] Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: IND8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Phe Ser
                35                  40                  45

Ser Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: P10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser
                50                  55                  60

Ser Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Leu Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Trp Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Leu Asp Val Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190







(2) INFORMATION FOR SEQ ID NO: 70: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:


Tyr Glu Val Arg Asn Val Ser Gly Ala Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Val Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Gln Glu Gly Asn Ser Ser
                35                  40                  45

Gln Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                50                  55                  60

Thr Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Val Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Ile Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln Asn Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S45 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Tyr Glu Val Arg Asn Val Ser Gly Ala Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Val Asp Val Ile Leu
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser
                50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:


Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser
                50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg Tyr Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

Arg Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190







(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Ala Asn Ser Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Thr
                50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Val Met Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190





35 
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(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr Tyr Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Ser Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Lys Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:



Tyr Glu 


vai 






Val 


Sat* 


Glv 
uxy 


Met 


Tyr His 


Val 


Thr 


Asn Asp 
















xu 






il ml 


Cys Ser 


Asn 


C a v* 
OCi 


OCX 




VAX 






Ala Ala 


iuu 




lie Met 






o ft 


















His Thr 


XT ITU 


vjxy 




V GLJL 


nu 




Val 


At*ct CX"\ 11 


Glv 


Ann 


Ser Ser 






OS 










Aft 








Arg Cys 




Val 




JJCU 




-t i. W 






Ala 


At^ct 
/*xy 


Asn Thr 




c ft 










DO 






SO 


Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90


Gly Ser 


vax 


■file 


Leu 


v=»i 


C A V 




Lieu 






Car 


pyn Atvt 

jet x \j ax y 
















XUU 








Arg His Glu Thr Leu Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Thr Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Ala Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190







(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:






Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Thr Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Gln His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly
                185                 190









(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:



Ala Gln Val Arg Asn Thr Ser Arg Gly Tyr Met Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Glu Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu
                 20                  25                  30

His Val Pro Gly Cys Ile Pro Cys Glu Arg Leu Gly Asn Thr Ser
                 35                  40                  45

Arg Cys Trp Ile Pro Val Thr Pro Asn Val Ala Val Arg Gln Pro
                 50                  55                  60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val
                 65                  70                  75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Arg
                 95                 100                 105

Arg His Trp Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro
                140                 145                 150

Glu Val Ile Ile Asp Ile Ile Gly Gly Ala His Trp Gly Val Met
                155                 160                 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
                170                 175                 180

Ile Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

r Asn Asp 
15 

a Val Leu 

15 ~~ * * - ^ - 25 3Q 

Ser 
45 
Pro 
60 
Val 
75 
Cys 
90 
Gin 
105 
Gly 
120 
Trp 
135 
Pro 
150 
Met 
165 
Val 
180 

Val Val lie Leu Leu Leu Ala Ala Gly Val Asp Ala 

30 



20 



25 



Lys Asn 


Thr 


Thr 


Asn Ser 


Tyr Met Val 


5 








10 


Asp Ser 


He 


Thr 


Trp Gin 


Leu Gin Ala 


20 








25 


Gly Cys 


Val 


Pro 


Cys Glu 


Lys Thr Gly 


35 








40 


lie Pro 


Val 


Ser 


Pro Asn 


Val Ala Val 


50 








55 


Thr Gin 


Gly 


Leu 


Arg Thr 


His He Asp 


65 






70 


Thr Leu 


Cys 


Ser 


Ala Leu 


Tyr Val Gly 


80 






85 


Met Leu 


Ala 


Ala 


Gin Met 


Phe He Val 


95 








100 


Phe Val 


Gin 


Asp 


Cys Asn 


Cys Ser He 


110 








115 


Gly His 


Arg 


Met 


Ala Trp 


Asp Met Met 


125 








130 


Ala Thr 


Met 


lie 


Leu Ala 


Tyr Ala Met 


140 








145 


Leu Asp 


lie 


Val 


Ser Gly 


Ala His Trp 


155 








160 


Ala Tyr 


Phe 


Ser 


Met Gin 


Gly Ala Trp 


170 








175 


Leu Leu 


Leu 


Ala 


Ala Gly 


Val Asp Ala 


185 








190 



(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

1 Thr Asn Asp 
15 

a Ala Val Leu 
30 

y Asn Ala Ser 
45 

1 Gin Arg Pro 

10 " " 50 55 60 

rriv &ia Tipn Thr Gin Glv Leu Arg Thr His He Asp Met Val Val 

75 

y Asp Leu Cys 
90 

.e Ser Pro Gin 
105 

_ .e Tyr Pro Gly 

15 Ma " M iin " 115 120 

it Met Asn Trp 
135 

it Arg Val Pro 
150 

:p Gly Val Met 
165 

:p Ala Lys Val 
180 



20 



Lys 


Asn 


Thr Ser Thr Ser 


trim «•» %Mf* *» 

Tyr raeu 


5 




10 


Asp 


Ser 


He Thr Trp Gin 


Leu Gin 


20 




25 


Gly 


Cys 


Val Pro Cys Glu 


Arg Val 


35 




40 


lie 


Pro 


Val Ser Pro Asn 


Val Ala 




50 




55 


Thr 


Gin 


Gly Leu Arg Thr 


His He 




65 




70 


Thr 


Leu 


Cys Ser Ala Leu 


Tyr Val 




80 


85 


Met 


Leu 


Ala Ala Gin Met 


Phe He 




95 




100 


Phe 


Val 


Gin Glu Cys Asn 


Cys Ser 




110 




115 


Gly 


His 


Arg Met Ala Trp 


Asp Met 


125 




130 


Thr 


Thr 


Met He Leu Ala 


Tyr Ala 




140 




145 


He 


Asp 


lie He Ser Gly 


Ala His 




155 




160 


Ala Tyr 


Phe Ser Met Gin 


Gly Ala 




170 




175 


Leu 


Leu 


Leu Thr Ala Gly 


Val Asp 




185 




190 



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: US10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Val Gln Val Lys Asn Thr Ser Thr Ser Tyr Met Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Glu Ala Ala Val Leu
                 20                  25                  30

His Val Pro Gly Cys Val Pro Cys Glu Lys Val Gly Asn Thr Ser
                 35                  40                  45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro
                 50                  55                  60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val
                 65                  70                  75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Phe Cys
                 80                  85                  90

Gly Gly Met Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Arg
                 95                 100                 105

His His Ser Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Thr Leu Ile Leu Ala Tyr Val Met Arg Val Pro
                140                 145                 150

Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His Trp Gly Val Leu
                155                 160                 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
                170                 175                 180

Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala
                185                 190





(2) INFORMATION FOR SEQ ID NO: 81: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 






Val Glu Val Arg Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu
                 20                  25                  30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu
                 35                  40                  45

Arg Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg
                 50                  55                  60

Gly Ala Leu Thr His Asn Leu Arg Thr His Val Asp Val Ile Val
                 65                  70                  75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
                 80                  85                  90

Gly Ala Val Met Ile Val Ser Gln Ala Leu Ile Ile Ser Pro Glu
                 95                 100                 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp
                125                 130                 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro
                140                 145                 150

Glu Leu Ala Leu Gln Val Val Phe Gly Gly His Trp Gly Val Val
                155                 160                 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
                170                 175                 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala
                185                 190





(2) INFORMATION FOR SEQ ID NO: 82: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homo sapiens
(C) INDIVIDUAL ISOLATE: DK11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:



Val Glu Val Arg Asn Thr Ser Ser Ser Tyr Tyr Ala Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu
                 20                  25                  30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu
                 35                  40                  45

His Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg
                 50                  55                  60

Gly Ala Leu Thr His Asn Leu Arg Ala His Ile Asp Met Ile Val
                 65                  70                  75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
                 80                  85                  90

Gly Ala Val Met Ile Val Ser Gln Ala Phe Ile Val Ser Pro Glu
                 95                 100                 105

His His His Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp
                125                 130                 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro
                140                 145                 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val
                155                 160                 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
                170                 175                 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala
                185                 190
(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW3



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 



Val Glu Val Arg Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu
                 20                  25                  30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu
                 35                  40                  45

His Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg
                 50                  55                  60

Gly Ala Leu Thr His Asn Leu Arg Ala His Val Asp Met Ile Val
                 65                  70                  75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Met Cys
                 80                  85                  90

Gly Ala Val Met Ile Val Ser Gln Ala Phe Ile Ile Ser Pro Glu
                 95                 100                 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly
                110                 115                 120

Arg Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp
                125                 130                 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro
                140                 145                 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val
                155                 160                 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
                170                 175                 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala
                185                 190





(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

Val Glu Val Arg Asn Thr Ser Phe Ser Tyr Tyr Ala Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu
                 20                  25                  30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu
                 35                  40                  45

Arg Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg
                 50                  55                  60

Gly Ala Leu Thr His Asn Leu Arg Thr His Val Asp Val Ile Val
                 65                  70                  75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
                 80                  85                  90

Gly Ala Val Met Ile Ala Ser Gln Ala Phe Ile Ile Ser Pro Glu
                 95                 100                 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp
                125                 130                 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro
                140                 145                 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val
                155                 160                 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val
                170                 175                 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala
                185                 190









(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S83

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:




Val Glu Val Lys Asp Thr Gly Asp Ser Tyr Met Pro Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Trp Gln Leu Glu Gly Ala Val Leu
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Glu Arg Thr Ala Asn Val Ser
                 35                  40                  45

Arg Cys Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro
                 50                  55                  60

Gly Ala Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val
                 65                  70                  75

Met Ser Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
                 80                  85                  90

Gly Ala Leu Met Leu Ala Ala Gln Val Val Val Val Ser Pro Gln
                 95                 100                 105

His His Thr Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

Arg Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Thr Met Leu Leu Ala Tyr Leu Val Arg Ile Pro
                140                 145                 150

Glu Val Ile Leu Asp Ile Val Thr Gly Gly His Trp Gly Val Met
                155                 160                 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ser Trp Ala Lys Val
                170                 175                 180

Ile Val Ile Leu Leu Leu Thr Ala Gly Val Glu Ala
                185                 190







(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK12

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:



Leu Glu Trp Arg Asn Val Ser Gly Leu Tyr Val Leu Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser
                 35                  40                  45

Thr Cys Trp Thr Ser Val Thr Pro Thr Val Ala Val Arg Tyr Val
                 50                  55                  60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
                 80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg
                 95                 100                 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly
                110                 115                 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro
                140                 145                 150

Gln Thr Leu Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Met
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val
                170                 175                 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:






Leu Glu Trp Arg Asn Val Ser Gly Leu Tyr Val Leu Thr Asn Asp
                  5                  10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser
                 35                  40                  45

Thr Cys Trp Thr Ser Val Thr Pro Thr Val Ala Val Arg Tyr Val
                 50                  55                  60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys
                 80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg
                 95                 100                 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly
                110                 115                 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro
                140                 145                 150

Gln Thr Leu Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val
                170                 175                 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 



(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:




Leu Glu 


Trp 


Arg 


Asn 


im 


Ser 


KsXy JUcU 


Tyr Val 


Leu Thr 


Asn Asp 




5 








10 




15 


Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser
                 35                  40                  45

Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
                 50                  55                  60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys
                 80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg
                 95                 100                 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly
                110                 115                 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro
                140                 145                 150

Gln Thr Val Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val
                170                 175                 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala
                185                 190







(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S52 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser
                 35                  40                  45

Met Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
                 50                  55                  60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Met Cys
                 80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg
                 95                 100                 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Ala Val Gly Met Val Val Ala His Ile Leu Arg Leu Pro
                140                 145                 150

Gln Thr Leu Phe Asp Ile Leu Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val
                170                 175                 180

Ala Ile Val Met Ile Met Phe Ser Gly Val Asp Ala
                185                 190





(2) INFORMATION FOR SEQ ID NO: 90: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S54

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:




Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Ile Leu Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser
                 35                  40                  45

Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val
                 50                  55                  60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Met Cys
                 80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg
                 95                 100                 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly
                110                 115                 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Ala Val Gly Met Val Val Ala His Ile Leu Arg Leu Pro
                140                 145                 150

Gln Thr Leu Phe Asp Ile Leu Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val
                170                 175                 180

Ala Ile Ile Met Ile Met Phe Ser Gly Val Asp Ala
                185                 190









(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu
20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Thr Ser
35 40 45

Arg Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Ala His Pro
50 55 60

Gly Ala Pro Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val
65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
80 85 90

Gly Gly Ala Phe Leu Met Gly Gln Met Ile Thr Phe Arg Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Glu Cys Asn Cys Ser Ile Tyr Thr Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Thr Leu Leu Leu Ala Gln Ile Met Arg Val Pro
140 145 150

Thr Ala Phe Leu Asp Met Val Ala Gly Gly His Trp Gly Val Leu
155 160 165

Ala Gly Leu Ala Tyr Phe Ser Met Gln Gly Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 



Val His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Thr Ser Ile Val Tyr Glu Thr Glu His His Ile Met
20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Arg Thr Glu Asn Thr Ser
35 40 45

Arg Cys Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Pro
50 55 60

Asn Ala Pro Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val
65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Phe Tyr Ile Gly Asp Leu Cys
80 85 90

Gly Gly Val Phe Leu Val Gly Gln Leu Phe Asp Phe Arg Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Ser Ala Leu Ile Met Ala Gln Ile Leu Arg Ile Pro
140 145 150

Ser Ile Leu Gly Asp Leu Leu Thr Gly Gly His Trp Gly Val Leu
155 160 165

Ala Gly Leu Ala Phe Phe Ser Met Gln Ser Asn Trp Ala Lys Val
170 175 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Glu Gly
185 190











(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:




Val Asn 


Tyr 


Arg 


Asn 


Ala 


Ser 


Gly 


Val 


Tyr 


His 


Val 


Thr 


/loll 








5 










10 










Cys Pro 


Asn 


Ser 


Ser 


lie 


Val 


Tyr 


Glu 


Ala 


Glu 


His 


Gin 


He Leu 






20 








25 








i n 
o u 


His Leu 


Pro 


Gly 


Cys 


Leu 


Pro 


Cys 


Val 


Arg 


Val 


Gly 


Asn 


CZ1 Tt Cor 








35 










40 










Arg Cys 


Trp 


Val 


Ala 


Leu 


Thr 


Pro 


Thr 


Val 


Ala 


Val 


Ser 


Tvr Tl <a 






50 










55 








O \J 


Gly Ala 


Pro 


Leu 


Asp 


Ser 


Leu 


Arg 


Arg 


His 


Val 


Asp 


Leu 








65 










70 








/ D 


Gly Ala 


Ala 


Thr 


Val 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 


Gly 


Asp 








80 










85 








y u 


Gly Gly 


Ala 


Phe 


Leu 


Val 


Gly 


Gin 


Met 


Phe 


Ser 


Phe 


Gin 


jet j. w y 






95 










100 










Arg His 


Trp 


Thr 


Thr 


Gin 


Asp 


Cys 


Asn 


Cys 


Ser 


He 


Tyr 


Ala mv 






110 










115 








ion 


His He 


Thr 


Gly 


His 


Arg 


Met 


Ala 


Trp 


Asp 


Met 


Met 


Met 


Asn Trt5 






125 










130 








135 


Ser Pro 


Thr 


Thr 


Thr 


Leu 


Leu 


Leu 


Ala 


Gin 


Val 


Met 


Arg 


lie Pro 








140 










145 








ISO 


Ser Thr 


Leu 


Val 


Asp 


Leu 


Leu 


Ala 


Gly 


Gly 


His 


Trp 


Gly 


Val Leu 








155 










160 








165 


Val Gly 


Leu 


Ala 


Tyr 


Phe 


Ser 


Met 


Gin 


Ala 


Asn 


Trp 


Ala 


Lys Val 






170 










175 








180 


He Leu 


Val 


Leu 


Phe 


Leu 


Phe 


Ala 


Gly 


Val 


Asp 


Ala 







185 190

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: Z7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:






val 


Asn Tyr 


His Asn Ala 


Ser Gly Val 


Tyr His 


lie Thr 


Asn Asp 




5 




10 






15 


Cys 


Pro Asn 


Ser Ser lie 


Met Tyr Glu 


Ala Glu 


His His 


He 


Leu 




20 




25 






30 


His 


Leu Pro 


Gly Cys Val 


Pro Cys Val 


Arg Glu 


Gly Asn 


Gin 


Ser 






35 




40 






45 


Arg 


Cys Trp 


Val Ala Leu 


Thr Pro Thr 


Val Ala 


Ala Pro 


Tyr 


lie 


50 




55 






60 


Gly Ala Pro 


Leu Glu Ser 


lie Arg Arg 


His Val 


Asp Leu 


Met 


Val 






65 




70 






75 



Gly Ala 


Ala 


Thr 


Val 


Cys 


Ser 


Ala 


Leu 


Tyr 


He Gly Asp 


Leu Cys 








80 










85 




90 


Gly Gly 


vai 


PHe 


Leu 


val 


Gly 


Gin 


Met 


Phe 


Ser Phe Gin 


Pro Arg 








95 










100 




105 


Arg His 


Trp 


inr 


Thr 




ASp 


Cys 


as n 


Cys 


Ser He Tyr 


Ala Sly 








110 










115 




120 


nlS V3J. 


inr 


«xy 


nlS 


Arg 


Met 


Ai.a 


Trp 


ASp 


Met Met Met 


Asn Trp 








IOC 










130 




135 


Ser Pro 


Thr 


Thr 


Thr 


Leu 


Val 


Leu 


Ala 


Gin 


Val Met Arg 


lie Pro 








140 










145 


150 


Ser Thr 


Leu 


Val 


Asp 


Leu 


Leu 


Thr 


Gly 


Gly 


His Trp Gly 


He Leu 








155 










160 


165 


He Gly 


Val 


Ala 


Tyr 


Phe 


Cys 


Met 


Gin 


Ala 


Asn Trp Ala 


Lys Val 








170 










175 




180 


lie Leu 


Val 


Leu 


Phe 


Leu 


Tyr 


Ala 


Gly 


Val 


Asp Ala 










185 










190 








(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK13 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

Tyr Asn Tyr Arg Asn Ser Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Asp Tyr His Ile Leu
20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Lys Ser
35 40 45

Thr Cys Trp Val Ser Leu Thr Pro Thr Val Ala Ala Gln His Leu
50 55 60

Asn Ala Pro Leu Glu Ser Leu Arg Arg His Val Asp Leu Met Val
65 70 75

Gly Gly Ala Thr Leu Cys Ser Ala Leu Tyr Ile Gly Asp Val Cys
80 85 90

Gly Gly Val Phe Leu Val Gly Gln Leu Phe Thr Phe Gln Pro Arg
95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Thr Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Ala Thr Leu Val Leu Ala Gln Leu Met Arg Ile Pro
140 145 150

Gly Ala Met Val Asp Leu Leu Ala Gly Gly His Trp Gly Ile Leu
155 160 165

Val Gly Ile Ala Tyr Phe Ser Met Gln Ala Asn Trp Ala Lys Val
170 175 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Ser Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asp Asn Val Ser
35 40 45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Thr Phe
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Leu Met Ala Gln Met Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Gly
185 190

(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:



Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asp Asn Val Ser
35 40 45

Lys Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Leu Met Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190





(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SAB

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Lys Glu Gly Asn Val Ser
35 40 45



WO 95/0140 
Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Val Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Val Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
155 160 165

Phe Ala Val Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Gly
185 190



(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA6 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 



Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser
35 40 45

Arg Cys Trp Val His Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
80 85 90

Gly Ala Leu Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
95 100 105

Gln His Ala Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Ala Thr Ala Leu Val Met Ala Gln Met Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: SA7 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asn Asn Val Ser
35 40 45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu
50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Ser Tyr Arg Pro Arg
95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
125 130 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro
140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
185 190









(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA13 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

1 Thr Asn Asp 
15 

p Leu lie Leu 
30 

y Asn Val Ser 
45 

a Pro Ser Leu 
60 

p Tyr Leu Ala 
75 

y Asp Ala Cys 

15 ~ ~ fln 85 90 

x Ser Pro Arg 
105 

.e Tyr Ser Gly 
120 

it Met Asn Trp 
135 

iu Arg lie Pro 
150 

:p Gly Val Leu 
165 

rp Ala Lys Val 
180 



20 



25 



Val Pro Tyr 


Arg 


Asn 


Ala 


Ser 


Gly Val 


Tyr His 


5 








10 


Cys Pro Asn 


Ser 


Ser 


He 


Val 


Tyr Glu 


Ala Asp 




20 








25 


His Ala Pro 


Gly 


Cys 


Val 


Pro 


Cys Val 


Arg Gin 




35 








40 


Arg Cys Trp 


Val 


Gin 


lie 


Thr 


Pro Thr 


Leu Ser 




50 








55 


Gly Ala Val 


Thr 


Ala 


Pro 


Leu 


Arg Arg 


Ala Val 




65 








70 


Gly Gly Ala 


Ala 


Leu 


Cys 


Ser 


Ala Leu 


Tyr Val 




80 








85 


Gly Ala Val 


Phe 


Leu 


Val 


Gly 


Gin Met 


Phe Thr 




95 








100 


Arg His Asn 


Val 


Val 


Gin 


Asp 


Cys Asn 


Cys Ser 




110 








115 


His lie Thr 


Gly 


His 


Arg 


Met 


Ala Trp 


Asp Met 




125 








130 


Ser Pro Thr 


Thr 


Ala 


Leu 


Val 


Met Ala 


Gin Leu 




140 








145 


Gin Val Val 


He 


Asp 
155 


lie 


lie 


Ala Gly 


Ala His 
160 


Phe Ala Ala 


Ala 


Tyr 
170 


Tyr 


Ala 


Ser Ala 


Ala Asn 
175 


Val Leu Val 


Leu 


Phe 
185 


Leu 


Phe 


Ala Gly 


Val Asp 
190 



(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:



Leu 


Tlxr 


Tyr 


Gin 


Asn 


Ser 


Ser 


Gin 


Leu 


Tyr 


His 


Leu 


Thr 


Asn Asp 








1 










ill 










1j 


Cys 


Pro 


Asn 


Ser 


Ser 


lie 


Vdl 


Leu 




7A1 3 


Asp Ala 


Mot 


-LJLe 


Leu 








20 




















30 


His 


Leu 


Pro 


Gin 


Cys 


Leu 


Pro 


Cys 


Val 


Arg 


Val Asp 


Asp 


Arg 


Ser 










35 










AH 












Thr Cys 


Trp 


tHLS 


TV 1 a 


V CLJL 


Thr 

LLLL 


£ JL w 




JJCU 


Aia 


xie 


Pro 


7V on 
adU 


AJLcL 








C (\ 

bu 




















fin 


Ser 


Thr 


Pro 


A±SL 


rnt« 

lnr 

DO 


Gin 


irlltr 




7AT"fT 

Airg 


/ VJ 


val 


ASp 


Leu 


Leu 


aJLcL 

75 


Gin 


Ala 


Ala 


Val 


Val 


Cys 


Ser 


Ser 


Leu 


Tyr 


xie 


OrXn 


Asp 


Leu 


wys 










80 


















on 


Gin 


Ser 


Leu 


Pne 


Leu 
95 




vs±n 






i nn 
xuu 


Thr 


Phe 


Gin 


pro 


Arg 
105 


Arg 


His 


Trp 


Thr 


Val 


Gin 


Asp 


Cys 


Asn 


Cys 


Ser 


lie 


Tyr 


rn"U >- 

ixir 








110 




















X A u 


His 


Val 


xnr 




XlXS 








irp 


ASM 


Met 


Met 


Met 


Asn 


Trp 










125 








130 










135 


Ser 


Pro 


Thr 


Thr 


Thr 
140 


Leu 


Val 


Leu 


Ser 


Ser 
145 


lie 


Leu 


Arg 


Val 


Pro 
150 


Glu 


lie 


Cys 


Ala 


Ser 


Val 


lie 


Phe 


Gin 


Gin 


His 


Trp 


Gin 


He 


Leu 








155 










160 










165 


Leu 


Ala 


Val 


Ala 


Tyr 
170 


Phe 


Gin 


Met 


Ala 


Gin 
175 


Asn 


Trp 


Leu 


Lys 


Val 
180 


Leu 


Ala 


Val 


Leu 


Phe 


Leu 


Phe 


Ala 


Gin 


Val 


Glu 


Ala 









15 

Leu Ala Val Leu 

185 190 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

GCGTCCGGGT TCTGGAAGAC GGCGTGAACT ATGCAACAGG 40 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

AGGCTTTCAT TGCAGTTCAA GGCCGTGCTA TTGATGTGCC 40 

(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

AAGACGGCGT GAACTATGCA ACAGGGAACC TTCCTGGTTG 40

(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

AGTTCAAGGC CGTGCTATTG ATGTGCCAAC TGCCGTTGGT 40 






(2) INFORMATION FOR SEQ ID NO: 107: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

AAGACGGCGT GAATTCTGCA ACAGGGAACC TTCCTGGTTG 40 



(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

AGTTCAAGGC CGTGGAATTC ATGTGCCAAC TGCCGTTGGT 40 






(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

ARCTYCGACG TYACATCGAY CTGCTYGTYG GRAGYGCCAC CC 42 
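
Several of the oligonucleotide sequences in this listing, such as SEQ ID NO: 109, contain letters other than A, C, G and T. Assuming these follow the standard IUPAC nucleotide ambiguity codes (R = A or G, Y = C or T, and so on), which the listing itself does not state, the following minimal Python sketch counts how many distinct unambiguous oligonucleotides one degenerate entry could represent. The function name `degeneracy` is illustrative only and is not part of the original disclosure.

```python
# Standard IUPAC nucleotide ambiguity codes (assumed to apply to this listing).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(oligo: str) -> int:
    """Return the number of unambiguous sequences a degenerate oligo covers."""
    count = 1
    # Grouping spaces, as printed in the listing, are ignored.
    for base in oligo.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count


# SEQ ID NO: 109 exactly as printed, with its grouping spaces.
seq_109 = "ARCTYCGACG TYACATCGAY CTGCTYGTYG GRAGYGCCAC CC"
print(len(seq_109.replace(" ", "")), degeneracy(seq_109))
```

Under that assumption, each R or Y position doubles the number of oligonucleotides covered, so the count grows quickly with the number of ambiguous positions.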


(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

RCARGCCRTC TTGGAYATGA TCGCTGGWGC Y 31 



(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

CRATACGACR YCAYGTCGAY TTGCTCGTTG GGGCGGCTRY YT 42 



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

RCAAGCTRTC RTGGAYRTGG TRRCRGGRGC C 31



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

TTGCGGACKC ACATYGACAT GGTYGTGATG TCCGCCACGC 40 



(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

GATGCGCGTT CCCGAGGTCA TCWTAGACAT CRTYRGCGGR GCD 43 



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

AATGGCACCY TGCRCTGCTG GATACAAGTR ACACCTAATG TGGCTGTGAA 50
ACAC 54



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

ARCTAGYC CTYSARGTYG TCTTCGGYGG Y 31 


(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

GCCAACGTCT CTCGATGTTG GGTGCCGGTT GCCCCCAATC TCGCCATAAG 50 
TCAA 54 

(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

AAGGGCCTGC GAGCACACAT CGATATCATC GTGATGTCTG CTACGG 46 

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 

TTGGTGCGCA TCCCGGAAGT CATCTTGGAT ATTGTTACAG GAGGT 45 






(2) INFORMATION FOR SEQ ID NO: 120: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

AGTCAGGTAY GTCGGAGCAA CCACCGCYTC GATACGCAGT 40 

(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

AGCCTTCACG TTCAGACCKC GTCGCCATCA AACRGTCCAG ACCTGT 46

(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 75 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

TCCCCCGCYG TGGGTATGGT GGTRGCGCAC RTYCTGCGDY TGCCCCAGAC 50 
CKTGTTYGAC ATAMTRGCYG GGGCC 75 

(2) INFORMATION FOR SEQ ID NO: 123: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:

ACGCCGGTGA CGCCTACAGT GGCTGTCGCA CACCCGGGC 39 

(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

ATGAGGGTCC CCACAGCCTT TCTCGACATG GTTGCCGGAG GC 42 

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

CGCGCCCTAT CCCAACGCAC CGTTAGAGTC CATGCGCAGG 40

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

TCAGATCTTA CGGATCCCCT CTATCCTAGG TGACTTGCTC ACCGGGGGT 49 



(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

CAGTCACGCT GCTGGGTGGC CCTTACTCCC ACCGTGGCGG YGYCTTATAT 50 
CGGT 54 



(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

TAGCACTCTG GTRGAYCTAC TCRCTGGAGG G 31



(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

AAGTCTACAT GCTGGGTGTC TCTCACCCCC ACCGTGGCTG CGCAACATCT 50 
GAAT 54

(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

AGGCGCCATG GTCGACCTGC TTGCAGGCGG C 31



(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

TCAGCCCCGA VYYTCGGAGC GGTCACGGCT CCTCTTCGGA GGG 43



(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

TGYTACGGAT YCCCCARGTG GTCATHGACA TCATWGCCGG GGSC 

25 

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

CATACCAAAT GCTTCCACGC CCGCAACGGG ATTCCGCAGG 40



(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:

TCTTCTTGCG GGCGCCGCAG TGGTTTGCTC ATCCCTG 37

(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

ATCTAGCATC TTGAGGGTAC CTGAGATTTG TGCGAGTGTG ATATTTGGTG 50
GC 52



(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala
5 10 15
Leu Thr His Asn Leu Arg Xaa His Xaa Asp Xaa Ile Val Met Ala
20 25 30
Ala Thr Val

(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro Gly Ala
5 10 15
Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val Met Ser
20 25 30
Ala Thr Val
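
The peptides in this listing are written in three-letter amino acid code with interleaved position counters, whereas most sequence tools expect one-letter code. The following Python sketch converts SEQ ID NO: 137 to one-letter form and confirms the declared length of 33 residues. The function name `to_one_letter` is hypothetical, and mapping Xaa (an unspecified residue) to X is an assumption made here for illustration.

```python
# Standard three-letter to one-letter amino acid codes; Xaa is mapped to X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_text):
    """Convert space-separated three-letter residues to a one-letter string.

    Numeric tokens (position counters such as 5, 10, 15) are skipped.
    """
    tokens = [t for t in three_letter_text.split() if not t.isdigit()]
    return "".join(THREE_TO_ONE[t] for t in tokens)


# SEQ ID NO: 137, read with the residue names Ile and Gln.
seq_137 = (
    "Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro Gly Ala "
    "Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val Met Ser "
    "Ala Thr Val"
)
peptide = to_one_letter(seq_137)
print(peptide, len(peptide))
```

A token that is not a valid residue name raises a `KeyError`, which makes the function a convenient way to flag misspelled residue names in a listing.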



(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

Trp Ile Pro Val Xaa Pro Asn Val Ala Val Xaa Xaa Pro Gly Ala
5 10 15
Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser
20 25 30
Ala Thr Leu



(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

Trp Thr Xaa Val Thr Pro Thr Val Ala Val Arg Tyr Val Gly Ala
5 10 15
Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly Ala
20 25 30
Ala Thr Xaa



(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Trp Val Ala Leu Xaa Pro Thr Leu Ala Ala Arg Asn Xaa Xaa Xaa
5 10 15
Xaa Thr Xaa Xaa Ile Arg Xaa His Val Asp Leu Leu Val Gly Ala
20 25 30
Ala Xaa Phe

(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

Trp Val Xaa Xaa Xaa Pro Thr Val Ala Thr Arg Asp Gly Lys Leu
5 10 15
Pro Xaa Xaa Gln Leu Arg Arg Xaa Ile Asp Leu Leu Val Gly Ser
20 25 30
Ala Thr Leu

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

Trp Thr Pro Val Thr Pro Thr Val Ala Val Ala His Pro Gly Ala
5 10 15
Pro Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala
20 25 30
Ala Thr Leu



(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

Trp Val Ala Leu Thr Pro Thr Val Ala Xaa Xaa Tyr Ile Gly Ala
5 10 15
Pro Leu Xaa Ser Xaa Arg Arg His Val Asp Leu Met Val Gly Ala
20 25 30
Ala Thr Val



(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

Trp Val Ser Leu Thr Pro Thr Val Ala Ala Gln His Leu Asn Ala
5 10 15
Pro Leu Glu Ser Leu Arg Arg His Val Asp Leu Met Val Gly Gly
20 25 30
Ala Thr Leu

(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:

Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Pro Asn Ala
5 10 15
Pro Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala
20 25 30
Ala Thr Met

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

Trp Val Xaa Ile Thr Pro Thr Leu Ser Ala Pro Xaa Xaa Gly Ala
5 10 15
Val Thr Ala Pro Leu Arg Arg Xaa Val Asp Tyr Leu Ala Gly Gly
20 25 30
Ala Ala Leu

(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:

Trp His Ala Val Thr Pro Thr Leu Ala Ile Pro Asn Ala Ser Thr
5 10 15
Pro Ala Thr Gly Phe Arg Arg His Val Asp Leu Leu Ala Gly Ala
20 25 30
Ala Val Val

(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu
5 10 15
Xaa Leu Xaa Val Val Phe Gly Gly
20

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

Thr Thr Thr Met Leu Leu Ala Tyr Leu Val Arg Ile Pro Glu Val
5 10 15
Ile Leu Asp Ile Val Thr Gly Gly
20

(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

Thr Xaa Thr Xaa Ile Leu Ala Tyr Xaa Met Arg Val Pro Glu Val
5 10 15
Ile Xaa Asp Ile Xaa Xaa Gly Ala
20

(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:

Ala Val Gly Met Val Val Ala His Xaa Leu Arg Leu Pro Gln Thr
5 10 15
Xaa Phe Asp Ile Xaa Ala Gly Ala
20



(2) INFORMATION FOR SEQ ID NO: 152:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

Thr Xaa Ala Leu Val Xaa Ser Gln Leu Leu Arg Xaa Pro Gln Ala
5 10 15
Xaa Xaa Asp Xaa Val Xaa Gly Ala
20



(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:

Thr Xaa Ala Leu Val Xaa Ala Gln Leu Leu Arg Xaa Pro Gln Ala
5 10 15
Xaa Leu Asp Met Ile Ala Gly Ala
20



(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

Thr Thr Thr Leu Leu Leu Ala Gln Ile Met Arg Val Pro Thr Ala
5 10 15
Phe Leu Asp Met Val Ala Gly Gly
20



(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

Thr Thr Thr Leu Xaa Leu Ala Gln Val Met Arg Ile Pro Ser Thr
5 10 15
Leu Val Asp Leu Leu Xaa Gly Gly
20



(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Thr Ala Thr Leu Val Leu Ala Gln Leu Met Arg Ile Pro Gly Ala
5 10 15
Met Val Asp Leu Leu Ala Gly Gly
20



(2) INFORMATION FOR SEQ ID NO: 157:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:

Thr Ser Ala Leu Ile Met Ala Gln Ile Leu Arg Ile Pro Ser Ile
5 10 15
Leu Gly Asp Leu Leu Thr Gly Gly
20



(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

Xaa Thr Ala Leu Xaa Met Ala Gln Xaa Leu Arg Ile Pro Gln Val
5 10 15
Val Ile Asp Ile Ile Ala Gly Xaa
20

(2) INFORMATION FOR SEQ ID NO: 159:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

Thr Thr Thr Leu Val Leu Ser Ser Ile Leu Arg Val Pro Glu Ile
5 10 15
Cys Ala Ser Val Ile Phe Gly Gly
20



CLAIMS

1. A cDNA of the envelope 1 gene of the
hepatitis C virus wherein the cDNA has a sequence selected
from the group consisting of SEQ ID NO: 1 through SEQ ID
NO: 51.

2. A recombinant hepatitis C virus envelope 1
protein encoded by a gene whose sequence includes a
sequence selected from the group consisting of SEQ ID NO: 1
through SEQ ID NO: 51.

3. A recombinant protein having an amino acid
sequence selected from the group consisting of SEQ ID NO: 52
through SEQ ID NO: 102.

4. A method for the recombinant DNA-directed
synthesis of at least one complete envelope 1 protein of
hepatitis C virus said method comprising:

culturing a transformed or transfected host
organism containing a DNA sequence capable
of directing the host organism to produce an
envelope 1 protein under conditions such
that the protein is produced, said protein
exhibiting substantial homology to a protein
comprising the amino acid sequence selected
from the group consisting of SEQ ID NO: 52
through SEQ ID NO: 102.

5. The method of claim 4, wherein the host 
organism is transfected with a recombinant eukaryotic 
expression vector. 

6. The method of claim 4, wherein the 
eukaryotic vector is a baculovirus vector. 



7. The method of claim 4, wherein the host
organism is a eukaryotic cell.

8. The method of claim 7, wherein the 
eukaryotic cell is an insect cell. 

9. A recombinant expression vector comprising a
cDNA sequence selected from the group consisting of SEQ ID
NO: 1 through SEQ ID NO: 51.

10. A host organism transformed or transfected
with a recombinant expression vector according to claim 9.

11. A method of detecting antibodies to HCV in a 
biological sample suspected of containing said antibodies 
comprising: 

(a) contacting the sample with at least one 
recombinant protein of claim 3 to form 
an immune complex with the antibodies; 
and 

(b) detecting the presence of the immune 
complex. 

12. The method of claim 11 wherein the 
biological sample is selected from the group consisting of 
serum, saliva or lymphocytes or other mononuclear cells. 

13. The method of claim 11, wherein the
recombinant envelope 1 protein is bound to a solid support.

14. The method of claim 11, wherein the immune
complex is detected using a labeled antibody.

15. A hepatitis C virus kit comprising: at least
one recombinant protein comprising an amino acid sequence
selected from the group consisting of: SEQ ID NO: 52 through
SEQ ID NO: 102.

16. A pharmaceutical composition comprising at
least one recombinant protein of claim 3 and a suitable
excipient, diluent or carrier.

17. A method of preventing hepatitis C
infection, comprising administering the pharmaceutical
composition of claim 16 to a mammal in an effective amount
to stimulate the production of protective antibody.

18. A vaccine for immunizing a mammal against
hepatitis C infection, comprising at least one recombinant
protein according to claim 3 in a pharmacologically
acceptable carrier.

19. A method for detecting the presence of the
hepatitis C virus via a reverse transcription-polymerase
chain reaction process, wherein the primers are selected
from the sequences shown in SEQ ID NO: 103 through SEQ ID
NO: 108.

20. Substantially isolated and purified primers,
wherein said primers have nucleic acid sequences selected
from the group consisting of SEQ ID NO: 103 through SEQ ID
NO: 108.

21. A diagnostic kit for use in detecting the
presence of hepatitis C virus, said kit comprising: primers
having nucleic acid sequences selected from the group
consisting of SEQ ID NO: 103 through SEQ ID NO: 108.

22. A method for determining the genotype of a
hepatitis C virus, said method comprising:

(a) amplifying RNA via reverse
transcription-polymerase chain reaction
to produce amplification products;

(b) contacting said products with at least
one genotype-specific oligonucleotide;
and

(c) detecting complexes of said products
which bind to said oligonucleotide(s).

23. The method of claim 22, wherein said
amplification of step (a) uses a primer having a sequence
according to SEQ ID NO: 103 through SEQ ID NO: 108.

24. The method of claim 23, wherein said
oligonucleotide of step (b) is a nucleic acid sequence
selected from the group consisting of SEQ ID NO: 109 through
SEQ ID NO: 135.

25. Substantially isolated and purified
oligonucleotides, wherein said oligonucleotides have
nucleic acid sequences selected from the group consisting
of SEQ ID NO: 109 through SEQ ID NO: 135.

26. A diagnostic kit for determining the
genotype of a hepatitis C virus, said kit comprising
primers selected from the group consisting of SEQ ID NO: 103
through SEQ ID NO: 108 and hybridization probes selected
from the group consisting of SEQ ID NO: 109 through SEQ ID
NO: 135.

27. A substantially purified and isolated
peptide having an amino acid sequence selected from the
group consisting of SEQ ID NO: 136 through SEQ ID NO: 159.

28. A method of detecting antibodies specific
for a single genotype of HCV, said method comprising:

(a) contacting a biological sample with at
least one peptide of claim 27 to form
an immune complex with the antibodies,
and

(b) detecting the presence of the immune
complex.

29. The method of claim 28, wherein the
biological sample is selected from the group consisting of
serum, saliva or lymphocytes or other mononuclear cells.

30. The method of claim 28, wherein said peptide
is bound to a solid support.

31. The method of claim 28, wherein the immune
complex is detected using a labelled antibody.

32. A kit for use in detecting hepatitis C virus
antibodies, said kit comprising: at least one peptide
selected from the group consisting of SEQ ID NO: 136 through
SEQ ID NO: 159.

33. A pharmaceutical composition comprising at
least one peptide of claim 27 and a suitable excipient,
diluent or carrier.

34. A method of preventing hepatitis C
infection, comprising administering the pharmaceutical
composition of claim 33 to a mammal in an effective amount
to stimulate production of a protective antibody.

35. A vaccine for immunizing a mammal against
hepatitis C infection, comprising at least one peptide
according to claim 27 in a pharmaceutically acceptable
carrier.
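
The probe-matching step of the genotyping method in claim 22 (contacting amplification products with genotype-specific oligonucleotides and detecting which ones bind) can be approximated in silico. The Python sketch below is an illustration under stated assumptions, not the claimed method: exact same-strand matching with IUPAC ambiguity codes stands in for hybridization, the function names are hypothetical, and the amplicon is an invented test string. The two probe sequences are taken from the sequence listing; which genotype each one detects is not stated in this section, so they are labelled only by SEQ ID NO.

```python
import re

# IUPAC ambiguity codes written as regular-expression character classes.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def probe_pattern(probe):
    """Compile a (possibly degenerate) probe into a regular expression."""
    return re.compile("".join(IUPAC_REGEX[base] for base in probe.upper()))


def matching_probes(amplicon, probes):
    """Return the names of probes found somewhere in the amplicon."""
    amplicon = amplicon.upper()
    return [name for name, seq in probes.items()
            if probe_pattern(seq).search(amplicon)]


probes = {
    "SEQ ID NO: 128": "TAGCACTCTGGTRGAYCTACTCRCTGGAGGG",
    "SEQ ID NO: 130": "AGGCGCCATGGTCGACCTGCTTGCAGGCGGC",
}

# Hypothetical amplicon: invented flanks around one concrete form of SEQ ID NO: 128.
amplicon = "GGATCC" + "TAGCACTCTGGTAGACCTACTCACTGGAGGG" + "AAGCTT"
print(matching_probes(amplicon, probes))
```

Real hybridization tolerates some mismatches and depends on strand orientation and wash stringency, none of which this exact-match sketch models.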



WO 95/01442 PCT/US94/07320

FIGURE 1A

[Figure: nucleotide sequence alignment of the envelope 1 gene, with consensus, for isolates S14, DK7, US11, DR4, DR1, DK9, S18 and SW1 (SEQ ID NO: 5, 1, 8, 4, 3, 2, 6 and 7, respectively). The aligned sequences are not recoverable from the source.]

FIGURE 1B

[Figure: the figure label survives, but the row labels and aligned sequences of this panel are not recoverable from the source.]

[Further figure sheets, labels not recoverable: nucleotide sequence alignments with consensus for isolates DK12, HK10, S2, S54 and S52 (SEQ ID NO: 35, 36, 37, 39 and 38, respectively); for isolates Z7 and Z6 (SEQ ID NO: 43 and 42); and for isolates SA1, SA5, SA7, SA4, SA13 and SA6 (SEQ ID NO: 45, 47, 49, 46, 50 and 48); followed by a twelve-row nucleotide alignment whose row labels are not recoverable.]